What is a black box? A computer scientist explains what it means when the inner workings of AIs are hidden

For some people, the term “black box” brings to mind the recording devices in airplanes that are valuable for postmortem analyses if the unthinkable happens. For others it evokes small, minimally outfitted theaters. But black box is also an important term in the world of artificial intelligence.

An AI black box is an AI system whose internal workings are invisible to the user. You can feed it input and get output, but you cannot examine the system’s code or the logic that produced the output.

Machine learning is the dominant subset of artificial intelligence. It underlies generative AI systems like ChatGPT and DALL-E 2. There are three components to machine learning: an algorithm or a set of algorithms, training data and a model. An algorithm is a set of procedures. In machine learning, an algorithm learns to identify patterns after being trained on a large set of examples – the training data. Once a machine-learning algorithm has been trained, the result is a machine-learning model. The model is what people use.
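
To make the three components concrete, here is a minimal sketch in Python. It is an illustration only: the data values are invented, and the algorithm shown (fitting a straight line by least squares) is far simpler than what powers systems like ChatGPT. The names `training_data`, `fit_line` and `predict` are made up for this example.

```python
# Training data: example (input, output) pairs. These values are invented.
training_data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]


def fit_line(examples):
    """The algorithm: least-squares fit of y = slope * x + intercept."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    # How x and y vary together, and how much x varies on its own.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    variance = sum((x - mean_x) ** 2 for x, _ in examples)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    # The model: nothing more than the numbers the algorithm learned.
    return {"slope": slope, "intercept": intercept}


def predict(model, x):
    """Use the trained model on an input it has not seen before."""
    return model["slope"] * x + model["intercept"]


model = fit_line(training_data)
print(model)
print(predict(model, 5.0))
```

Here the algorithm is `fit_line`, the training data is the list of pairs, and the model is the small dictionary of learned numbers that `predict` uses. In this toy case all three are open to inspection; hiding any of them from the user is what turns a system into a black box.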

Why AI black boxes matter

In many cases, there is good reason to be wary of black box machine-learning algorithms and models. Suppose a machine-learning model has made a diagnosis about your health. Would you want the model to be a black box or a glass box, meaning one whose workings are open to inspection? What about the physician prescribing your course of treatment? Perhaps she would like to know how the model arrived at its decision.

What if a machine-learning model that determines whether you qualify for a business loan from a bank turns you down? Wouldn’t you like to know why? If you did, you could more effectively appeal the decision, or change your situation to increase your chances of getting a loan the next time.
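
As a rough illustration of what a transparent alternative could look like, the sketch below scores a loan application and reports how much each factor contributed. Everything in it is hypothetical: the factor names, the weights and the approval threshold are invented for the example, and real lending models are considerably more complex.

```python
# Hypothetical weights and threshold for a transparent loan score.
WEIGHTS = {
    "years_in_business": 2.0,
    "annual_revenue_100k": 1.5,  # revenue in units of $100,000
    "missed_payments": -4.0,
}
THRESHOLD = 10.0


def score_with_reasons(applicant):
    """Return the decision, the total score and each factor's contribution."""
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    total = sum(contributions.values())
    return total >= THRESHOLD, total, contributions


# An invented applicant, used only to show the output.
applicant = {"years_in_business": 3, "annual_revenue_100k": 2, "missed_payments": 1}
approved, total, reasons = score_with_reasons(applicant)

print("approved:", approved, "score:", total)
# List the factors from most harmful to most helpful.
for name, value in sorted(reasons.items(), key=lambda item: item[1]):
    print(f"  {name}: {value:+.1f}")
```

Because each contribution is visible, the applicant can see which factor hurt the score most and what might change the outcome next time. A black box model would return only the decision.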