Bias in new technologies, a Coded Bias (2021) review

Luiz Felipe Mendes
7 min read · Apr 10, 2023

Disclaimer: The views and opinions expressed in this blog post are solely those of the author, Luiz Felipe Mendes, and do not necessarily reflect the official policy or position of iFood. The content provided is for informational purposes only and should not be construed as legal or professional advice. iFood is not responsible for any actions taken based on the opinions expressed in this blog post.

I originally wrote this blog post in April 2021, before the LLM/GPT frenzy took off. Although the movie itself does not cover this topic, I believe it would be negligent of me to ignore this technology and its impact. Therefore, I have made some changes to the text and added a new section at the end.

Netflix has in its catalog a documentary called Coded Bias, which discusses how machine learning algorithms are used today and how they can carry (and often exhibit) biases that reflect our sexist and racist society. The documentary also addresses potential privacy issues arising from technologies like facial recognition when employed by governments and large corporations.

TL;DR

Among documentaries covering this topic, Coded Bias is the best I have seen in terms of content. While there are minor issues, the documentary effectively conveys the need for regulations on emerging AI-based technologies, as their unchecked use can cause significant harm to society.

AI is not evil

WALL-E (2008) — Pixar Animation Studios

The technology itself is not inherently evil. It brings numerous benefits to our daily lives. However, given AI’s current impact, it can be used to cause irreparable damage, such as aiding the election of extremist leaders who plunge entire countries into unnecessary suffering.

Will AI replace humans? Yes and no. Technology is most effective when combined with human capabilities. Machines can calculate, make decisions based on multiple factors, and perform simple or tedious tasks. Humans, on the other hand, excel at creativity, interaction, and contextual analysis. In the long run, many jobs (e.g., drivers, cashiers, lawyers) may be replaced or require minimal human intervention. That is why, as a society, we should start discussing (and also implementing) a universal basic income, in addition to reflecting on the future of work, but that is a subject for another post.

Instead of creating AI that makes legal decisions automatically, we can develop models that analyze past decisions, evaluate similarities and differences, and examine potential biases (e.g., race, gender). It’s crucial to have diverse perspectives on models that impact society, ensuring a comprehensive understanding of the situation and enabling necessary adjustments to minimize negative consequences.

Models and algorithms can serve as tools for learning about and reducing biases. The documentary highlights the biases present in models due to skewed data. This is an essential consideration for professionals in the field. Fortunately, we now have methods for detecting these disparities, allowing us to eliminate or significantly mitigate their impact. Given its importance, more researchers should focus on this area.

For example, human interviewers may have unconscious biases that influence their decisions. We could create a model that analyzes and identifies these implicit biases, enabling correction or acknowledgment.
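To make this concrete, here is a minimal sketch of one such disparity check, the demographic parity gap, computed with plain pandas. The data and column names are hypothetical, invented for the example:

```python
import pandas as pd

# Hypothetical interview outcomes; groups and values are illustrative only.
df = pd.DataFrame({
    "group":    ["A", "A", "A", "B", "B", "B", "B", "A"],
    "approved": [1,   0,   1,   0,   0,   1,   0,   1],
})

# Demographic parity: compare approval rates across groups.
rates = df.groupby("group")["approved"].mean()
print(rates)

# A large gap between the highest and lowest rate is a red flag
# worth investigating (it is evidence of bias, not proof).
gap = rates.max() - rates.min()
print(f"Demographic parity gap: {gap:.2f}")
```

In practice one would use a dedicated library and richer metrics, but even a check this simple can surface a disparity that would otherwise go unnoticed.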

Sources (bias and fairness in models):

Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings

A Survey on Bias and Fairness in Machine Learning

Fairer machine learning in the real world: Mitigating discrimination without collecting sensitive data
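The first paper above is easy to reproduce in spirit: pretrained word embeddings encode gender associations that analogy queries expose. A minimal sketch using gensim and a public GloVe model (the model name and queries are just one possible setup):

```python
import gensim.downloader as api

# Downloads a small pretrained GloVe model on first use.
model = api.load("glove-wiki-gigaword-100")

# Classic analogy probe: man is to programmer as woman is to ...?
# Biased embeddings tend to return stereotyped occupations.
results = model.most_similar(positive=["woman", "programmer"],
                             negative=["man"], topn=5)
for word, score in results:
    print(f"{word}: {score:.3f}")
```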

Strengths

Human perspectives: Creating a documentary about mathematical models is challenging. Featuring a protagonist and others affected by this technology is an effective way to engage viewers and spark interest in the subject.

Real examples: Discussing technology can be complex for those outside the field, but it’s easy to empathize with a well-regarded teacher who receives a low score from an algorithm. Automated grading isn’t inherently bad, but it should be explainable, allowing individuals to understand areas for improvement. Ultimately, a human should conduct the final analysis, considering the context. Human input is also necessary for refining the model, which will never be perfect. Performance scores should serve as just one tool in a comprehensive assessment process.

Proposes solutions: Many documentaries highlight the problems of AI and automated decision-making but neglect to suggest solutions. This documentary emphasizes the need for society to discuss and establish laws and regulations governing both the state and corporations. In Brazil, the LGPD (General Data Protection Law) and, in Europe, the GDPR (General Data Protection Regulation) have already begun this process.

Weaknesses

The documentary manages to stay focused and convey its main message, which is great, but there are some points for improvement.

HAL 9000 — 2001: A Space Odyssey, by Arthur C. Clarke

Treating technology as an entity: Portraying algorithms as beings or entities that perform actions like “collecting your data and making decisions that will change your life” can create a disconnect between the average person and the technology. This approach can make AI and machine learning seem distant, ethereal, and exclusively negative.

Black Box

Two areas of AI gaining traction are ethics and transparency. The documentary simplifies machine learning into a black box that makes decisions without explanation. While some algorithms are harder to understand than others, libraries like LIME, shap, and shapash can show which factors a model took into account for a given decision. Discussing explainability is crucial in scenarios where people are directly affected.

There are some important definitions regarding the question of “black box” and interpretability:

● Transparency: understanding how an algorithm builds a model and how it works, regardless of the data

● Global model interpretability: how a trained model makes predictions in general, i.e., which features matter most for the classification process

● Local model interpretability: understanding the reasons behind a specific prediction
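To make the last two definitions concrete, here is a minimal sketch using shap (mentioned above) with a scikit-learn model; the dataset and model choice are arbitrary, and exact plotting APIs vary slightly between shap versions:

```python
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Train any model; a tree ensemble keeps the SHAP computation fast.
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # one row of contributions per sample

# Global interpretability: which features matter most across the dataset.
shap.summary_plot(shap_values, X, plot_type="bar")

# Local interpretability: why the model produced this prediction for row 0.
shap.force_plot(explainer.expected_value, shap_values[0], X.iloc[0],
                matplotlib=True)
```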

To better understand this issue:

Interpretable Machine Learning: A Guide for Making Black Box Models Explainable

Conclusion

AI and technology enable societal growth and evolution, but they can be used for both good and evil. We are at a turning point regarding machine learning technologies. It’s essential to pay close attention to the following points:

Where this technology is being applied: It makes sense to have algorithms that help choose songs, movies, or food. However, we should be more cautious with models that have a more significant impact, like credit score algorithms. Areas such as surveillance, security, and health should adopt these technologies more slowly to respect people’s security, privacy, and freedom.

Companies and professionals: All companies and professionals in the field must understand their role in this revolution. We all have responsibilities to create, question, and seek a positive impact on what we create.

This text focuses on the documentary’s content. A cinematographic analysis would differ, but it’s challenging to separate the two, as the presentation is often as important as the content itself. The documentary’s narrative jumps between subjects without clear connections, which can be disorienting. Additionally, the use of a HAL-like character (from 2001: A Space Odyssey) that interacts with the viewer may come across as corny.

Love, Death & Robots (2019) — Netflix

GPT-3, GPT-4, Bard, and other LLMs (large language models)

Many readers of this post are likely familiar with Large Language Models (LLMs) or have at least encountered GPT-related content. If you’re unfamiliar with LLMs, I recommend checking out this link for an introduction. It’s important to note that the discussion surrounding these models is complex, so I will focus on connecting them to the movie’s discussion, with the possibility of exploring the topic further in a future post.

Large language models, such as GPT-4 (from OpenAI), have been a topic of discussion due to their potential impact on society. These models are trained on vast amounts of data and can generate human-like text, raising concerns about their ethical implications and potential biases.

Similar to the issues raised in the documentary Coded Bias, large language models can also perpetuate biases present in the data they are trained on. This can lead to the generation of biased, sexist, or racist content, which can have negative consequences for society.

Moreover, the use of large language models in various applications, such as content generation, translation, and summarization, can influence public opinion and decision-making processes. This highlights the need for regulations and guidelines to ensure that these models are used responsibly and ethically.
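One simple way to see such bias for yourself is a template probe: ask a language model to fill in an occupation sentence and compare the probabilities it assigns to gendered pronouns. A minimal sketch with a small masked language model via Hugging Face transformers (BERT here is a stand-in for larger LLMs, whose APIs differ, and the sentence template is just one possible probe):

```python
from transformers import pipeline

# A small masked language model serves as a stand-in for larger LLMs.
fill = pipeline("fill-mask", model="bert-base-uncased")

for occupation in ["doctor", "nurse", "engineer", "teacher"]:
    preds = fill(f"The {occupation} said that [MASK] would be late.", top_k=10)
    # Keep only gendered pronouns and compare the scores the model assigns.
    scores = {p["token_str"]: round(p["score"], 3) for p in preds
              if p["token_str"] in ("he", "she")}
    print(occupation, scores)
```

Skewed pronoun probabilities across occupations are exactly the kind of learned association, inherited from the training data, that the documentary warns about.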


Luiz Felipe Mendes

Co-founded Hekima, an AI company acqui-hired in 2020 by iFood. Head of Data Science at iFood — Recommendation and Search