Bias In, Bias Out
Are AI and machine learning helping to perpetuate bias?
We hear and read much about how AI is not a technology that will help eliminate bias around race, gender, and culture. AI seems to mirror the entrenched racial norms that exist in atomic (as opposed to digital) reality. There are many books about the crushing biases that algorithms feed into digital decision-making. These biases hurt African Americans in areas where disparities are already established, such as jobs, loans, education, and health.
Fairness has not been achieved. AI reinforces discrimination, undermining great efforts to eliminate this societal scourge. The mathematical models and data used are not regulated, and the industry lacks the diversity to counter the tendency of algorithms to reflect the culture of their human programmers.
The core of the problem is the decisions AI makes. Whenever decisions are made, there will be bias. Part of the purpose of AI is to be biased: to filter out what is undesirable and make a decision that matches the algorithm's goal. Some of these decisions can be unjust or unfair, especially to groups that are already marginalized.
Phillip Jansen et al. define AI as the science and engineering of machines with capabilities considered intelligent by the standard of human intelligence. That is not what AI really is. How could it be? AI's decision-making lacks relevance, understanding, compassion, experience, sensitivity, and wisdom. Making decisions based entirely on math (though sometimes very helpful) is not what I would associate with standard human intelligence.
An algorithm is a set of instructions. We often compare it to a roadmap that tells the computer where to go (what to do), leading to an output based on the available information. This is simple enough, but sometimes technology gets a little ahead of us before we can reel it back in. Programmers know the architecture of a neural network used for machine learning. Still, they do not fully understand what really happens between input and output when a decision is made by a machine learning algorithm.
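To make the "set of instructions" idea concrete, here is a deliberately trivial sketch in Python. The rule and its threshold are invented for illustration only; they do not come from any real system.

```python
# A trivial illustration of an algorithm: a fixed set of instructions that turns
# available information (the input) into an output. The threshold of 3 is an
# arbitrary, made-up rule used only for this example.
def decide(score_history):
    total = sum(score_history)   # step 1: combine the available information
    if total >= 3:               # step 2: apply a fixed rule
        return "approve"         # step 3a: output one decision
    return "deny"                # step 3b: or the other

print(decide([1, 1, 1]))  # -> approve
print(decide([0, 1, 0]))  # -> deny
```

The rule here is fully transparent. With a deep neural network, the "instructions" between input and output are learned weights that no programmer can read off in the same way.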
Machine learning is software that can learn, and it is grounded in statistics. Its main task is pattern recognition: algorithms identify rules or patterns that explain the data, and predictions can then be made from it. If the machine learning is supervised, humans provide feedback in the form of positive and negative labels, somewhat like yes and no designations. The algorithm focuses on one variable designated as the prediction target.
Imagine a goal to split a group of people into two categories: people with the traits to receive a security clearance and people with the characteristics that would deny one. Some of the variables for something like this are already well known: education level, experience, degree, awards, professional affiliations, and so on. The algorithm learns how to divide the two groups. The programmer uses examples and non-examples to train the system. The system learns to predict which traits will easily earn a security clearance and which traits will prevent someone from getting one. If the system is fed enough examples, a sort of digital template is formed.
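As a hedged sketch of what that supervised training could look like, here is a minimal example using scikit-learn's logistic regression. The feature names, the tiny dataset, and the labels are all invented for illustration; a real clearance system would be far more complex.

```python
# A minimal sketch of supervised learning for the hypothetical security-clearance
# example. Features and labels are invented for illustration only.
from sklearn.linear_model import LogisticRegression

# Each row: [years_of_education, years_of_experience, num_awards]
X_train = [
    [16, 10, 2],
    [18, 12, 3],
    [12,  1, 0],
    [10,  0, 0],
    [16,  8, 1],
    [11,  2, 0],
]
# Labels supplied by humans: 1 = clearance granted, 0 = clearance denied
y_train = [1, 1, 0, 0, 1, 0]

# Training learns a rule (the "digital template") that separates the two groups
model = LogisticRegression()
model.fit(X_train, y_train)

# Apply the learned rule to an applicant the system has never seen
print(model.predict([[14, 5, 1]]))  # e.g. [1] -> predicted "grant"
```

Whatever patterns separate the labeled examples, including any bias already present in those labels, become the rule applied to new applicants.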
Unsupervised machine learning means there is no human programmer feeding examples. The categories are unknown, and the algorithms form their own clusters. In this case, the security-clearance categories are made by the AI itself; the two groups are now based on the variables the AI selects. No programmer is needed to label the examples.
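A correspondingly minimal sketch of the unsupervised case might use k-means clustering on the same invented features. No labels are supplied; the algorithm forms the two groups on its own. The method and data are assumptions for illustration, not a claim about any real system.

```python
# A minimal sketch of unsupervised learning: no labels, the algorithm forms its
# own two clusters. Feature values are invented for illustration only.
from sklearn.cluster import KMeans

X = [
    [16, 10, 2],
    [18, 12, 3],
    [12,  1, 0],
    [10,  0, 0],
    [16,  8, 1],
    [11,  2, 0],
]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)
print(labels)  # e.g. [1 1 0 0 1 0] -- group ids chosen entirely by the algorithm
```

The cluster numbers carry no human meaning until someone interprets them, and that interpretation is another place where assumptions about the two groups can slip back in.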
In these examples of supervised and unsupervised machine learning being used to divide the two groups, the results could be biased by the data being analyzed. If race is part of the original data provided, that data point is folded in with all the others. And if the data was not good in the first place, considering factors such as its age, its amount, and its demographics, things can go awry.
Suppose, for example, the data was taken from the entire general population or from the current population of security clearance holders. In the latter case, the data collected would simply mirror the traits of existing holders, forming a template of a white man between the ages of 25 and 50 and all that goes along with that. In the former case, even more people would be excluded from a favorable decision because they do not have traits that match the established norm.
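To show the mechanism rather than just assert it, here is a small, entirely made-up illustration of how a sensitive attribute can survive being dropped from the data: a zip code acting as a near-perfect proxy for race, so that a model trained without the race column can still reproduce the same biased decisions. Every value in the table is invented.

```python
# A made-up illustration of proxy bias: even if the "race" column is removed,
# "zip_code" carries the same information, so the biased pattern in the
# historical decisions is still learnable. All values are invented.
import pandas as pd

data = pd.DataFrame({
    "zip_code":  ["11111", "11111", "22222", "22222", "11111", "22222"],
    "race":      ["black", "black", "white", "white", "black", "white"],
    "clearance": [0, 0, 1, 1, 0, 1],   # biased historical decisions
})

# zip_code predicts race perfectly in this toy table ...
print(pd.crosstab(data["zip_code"], data["race"]))
# ... so it also predicts the biased clearance outcomes, race column or not
print(pd.crosstab(data["zip_code"], data["clearance"]))
```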
This type of careless and rotten data collection goes on all the time. A very high percentage of data biases are unintentional. Since data collection is not adequately regulated, we have systems that help widen the gaps between rich and poor, the marginalized and the celebrated. People will be judged based on their zip codes, names, and skin color. That is not the only way African Americans get a raw deal with AI; AI is also used to invade the privacy of private citizens.
When an innovation is born, the people on the lowest rung of the social and economic pyramid are the guinea pigs for it. I wrote a technology column for a local newspaper a few years ago. At that time, the City of Colorado Springs deployed sound sensors in some neighborhoods to be used for gunshot detection. A sensor picks up the sound of gunfire, the sound is triangulated, and police are automatically dispatched to the scene. These systems also identify the caliber of the gun used and count the number of shooters (they claim). The local police were not much help when I inquired exactly where these gunshot detectors were being placed. However, I did not need them to tell me; I already knew: the neighborhoods with the highest crime rates. This turned out to be correct.

These are the communities with the most African American and Latino citizens, and they also have the lowest income levels. Civil liberties can be violated, and privacy is an issue. Conversations can also be picked up with these same technologies, and there is not enough regulation to keep them in their lane. Even though these gunshot sensors have not proved effective in fighting crime, there is no doubt they will remain in place. Sooner or later, the technology will be given a new assignment: listening to and analyzing the rhythm of a community. The data collected will benefit entrepreneurial entities willing and ready to exploit that rich, big-data mine.
Then there's the crime-prediction software that law enforcement agencies use to identify and analyze criminal activity based on every type of personal data, from court records to unemployment trends. Where does that data go, who owns it, and for what purposes will it be used beyond its stated use? Just as with other encounters with AI, people of color will be on the losing end. And just as in law enforcement, once you are in the system, you stay in it: if not physically, then emotionally, mentally, and socially. On top of that, you remain open to tracking and monitoring when you apply for financial assistance, home loans, health insurance, and more.
Personal experiences, opinions, and prejudices creep into the process, affecting entire communities. Building diversity in the industry is a good place to start correcting some of these problems. Diversified development and data science teams can offset some of the adverse effects of AI and machine learning on communities. Recruiting people from those communities into computer science and technology careers is not easy. As an African American educator working for a non-profit that provides free STEM training and industry certifications to underrepresented communities, my experience tells me that the entire educational system has to change.
All things digital must be introduced to children in preschool and then maintained and built upon practically throughout high school. We must make digital education, including information technology and cybersecurity, part of the common core and entrenched in every course, just as English and math are: two skills we practice every waking hour in one form or another. We rarely display the history skills we learned in school, yet we demonstrate our digital skills every time we use our phones, listen to music, make online purchases, watch entertainment, or fill out online forms. Social media platforms, search engines, commerce, security, and GPS are all part of our everyday experience.
Technology takes place in a social context, so it only makes sense to tackle the problems and challenges that AI and big data are causing in that context. The world of big data, and the AI that powers it, will grow unchecked if we wait too long to incorporate all cultures into the designing and testing of AI systems.
The only way to adequately combat this is for the people affected by AI's encroachment on privacy to collectively decide the issue is important enough to warrant action. I'm not pessimistic, but I don't think that will ever happen. AI is invisible, and though its impact is tremendous and can be clearly seen, AI itself is an abstract concept, not easily digestible in the minds of most Americans. As Mos Def once said, "The hardheaded gotta feel it, to believe it." By the time serious action is considered, we will be long past a total breakdown of privacy norms, and AI data discrimination will be legal.