INSUBCONTINENT EXCLUSIVE:

Vince Lynch, Contributor. Vince Lynch is CEO of IV.AI, an artificial intelligence company that teaches machines how to understand human language so companies can better engage, understand and serve their customers.

At this moment in history it's impossible not to see the problems that arise from human bias. Now magnify that by compute and you start to get a sense for just how dangerous human bias via machine learning can be.

The damage can be twofold:

Influence. If the AI said so, it must be true: people trust the outputs of AI, so if human bias is missed in training, it could compound the problem by infecting more people.

Automation. Sometimes AI models are plugged into a programmatic function, which could lead to the automation of bias.

But there is potentially a silver machine-learned lining.

Because AI can help expose truth inside messy data sets, it's possible for algorithms to help us better understand bias we haven't already isolated, and spot ethically questionable ripples in human data so we can check ourselves. Exposing human data to algorithms exposes bias, and if we are considering the outputs rationally, we can use machine learning's aptitude for spotting anomalies.

But the machines can't do it on their own. Even unsupervised learning is semi-supervised, as it requires data scientists to choose the training data that goes into the models. If a human is the chooser, bias can be present.

How the heck do we tackle such a bias beast? We will attempt to pick it apart.

The landscape of ethical concerns with AI

Bad examples abound. Consider the finding from Carnegie Mellon that showed that women were shown significantly fewer online ads for high-paying jobs than men were. Or recall the sad case of Tay, Microsoft's teen slang Twitter bot that had to be taken down after producing racist posts.

In the near future, such mistakes could result in hefty fines or compliance investigations, a conversation that's already occurring in the U.K. parliament.

All mathematicians and machine learning engineers should consider bias to some degree, but that degree varies from instance to instance. A small company with limited resources will often be forgiven for accidental bias as long as the algorithmic vulnerability is fixed quickly; a Fortune 500 company, which presumably has the resources to ensure an unbiased algorithm, will be held to a tighter standard.

Of course, an algorithm that recommends novelty T-shirts does not need nearly as much oversight as an algorithm that decides what dose of radiation to give to a cancer patient.

It's these high-stakes decisions that will become the most pronounced when legal liability enters the discussion. It's important for builders and business leaders to establish a process for monitoring the ethical behavior of their AI systems.

Three keys to managing bias when building AI

There are signs of existing self-correction in the AI industry: Researchers are looking at ways to reduce bias and strengthen ethics in rule-based artificial systems by taking human biases into account, for example. These are good practices to follow; it's important to be thinking proactively about ethics regardless of the regulatory environment.

Let's take a look at several points to keep in mind as you work on your AI.

1. Choose the right learning model for the problem.

There's a reason all AI models are unique: Each problem requires a different solution and provides varying data resources. There's no single model to follow that will avoid bias, but there are parameters that can inform your team as it's building.

For example, supervised and unsupervised learning models have their respective pros and cons. Unsupervised models that cluster or do dimensionality reduction can learn bias from their data set. If belonging to group A highly correlates to behavior B, the model can mix up the two.
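One rough way to spot this kind of entanglement before training is to measure how differently a behavior shows up across groups in the data. The sketch below is illustrative only: the field names `group` and `behavior_b`, the toy records and the helper `group_rate_gap` are all hypothetical and do not come from any particular library.

```python
from statistics import mean


def group_rate_gap(records, group_key, behavior_key):
    """Return the per-group rate of a behavior and the largest gap between groups."""
    by_group = {}
    for row in records:
        # Store the behavior as 1.0 / 0.0 so the mean is the rate within the group.
        by_group.setdefault(row[group_key], []).append(1.0 if row[behavior_key] else 0.0)
    rates = {group: mean(values) for group, values in by_group.items()}
    gap = max(rates.values()) - min(rates.values())
    return rates, gap


# Hypothetical toy data: group membership plus a behavior the model can observe.
records = [
    {"group": "A", "behavior_b": True},
    {"group": "A", "behavior_b": True},
    {"group": "A", "behavior_b": True},
    {"group": "A", "behavior_b": False},
    {"group": "other", "behavior_b": False},
    {"group": "other", "behavior_b": False},
    {"group": "other", "behavior_b": True},
    {"group": "other", "behavior_b": False},
]

rates, gap = group_rate_gap(records, "group", "behavior_b")
print(rates, gap)
```

A large gap does not prove the model will confuse the two, but it is a signal that clusters or reduced dimensions built on behavior B may end up standing in for group A.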

And while supervised models allow for more control over bias in data selection, that control can introduce human bias into the process. Non-bias through ignorance (excluding sensitive information from the model) may seem like a workable solution, but it still has vulnerabilities. In college admissions, sorting applicants by ACT scores is standard, but taking their ZIP code into account might seem discriminatory. But because test scores might be affected by the preparatory resources in a given area, including the ZIP code in the model could actually decrease bias.

You have to require your data scientists to identify the best model for a given situation. Sit down and talk them through the different strategies they can take when building a model. Troubleshoot ideas before committing to them.

It's better to find and fix vulnerabilities now, even if it means taking longer, than to have regulators find them later on.

2. Choose a representative training data set.

Your data scientists may do much of the legwork, but it's up to everyone participating in an AI project to actively guard against bias in data selection. There's a fine line you have to walk. Making sure the training data is diverse and includes different groups is essential, but segmentation in the model can be problematic unless the real data is similarly segmented.

It's inadvisable, both computationally and in terms of public relations, to have different models for different groups. When there is insufficient data for one group, you could possibly use weighting to increase its importance in training, but this should be done with extreme caution.
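To get a feel for why large weights deserve that caution, it can help to compute how big a multiplier a small group would need and what that does to the effective sample size. The sketch below is a back-of-the-envelope illustration, not a recommended procedure: the data set size of 10,000, the 10% target share and both helper names are made-up assumptions, and the effective sample size uses Kish's approximation, (sum of weights)^2 / (sum of squared weights).

```python
def multiplier_for_share(n_group, n_total, target_share):
    """Weight multiplier that gives a small group `target_share` of the total weight,
    assuming every other row keeps a weight of 1."""
    n_rest = n_total - n_group
    return target_share * n_rest / (n_group * (1.0 - target_share))


def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


# Hypothetical numbers: 40 rows from one small group in a data set of 10,000.
n_group, n_total = 40, 10_000
k = multiplier_for_share(n_group, n_total, target_share=0.10)

weights = [k] * n_group + [1.0] * (n_total - n_group)
n_eff = kish_effective_sample_size(weights)

print(f"multiplier: {k:.1f}")
print(f"effective sample size: {n_eff:.0f} of {n_total} rows")
```

Under these assumptions, each of the 40 rows counts for far more than an ordinary row, and the effective sample size drops well below the number of rows actually collected, which is one way to see why quirks in a handful of records can start to look like trends.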

Weighting can lead to unexpected new biases. For example, if you have only 40 people from Cincinnati in a data set and you try to force the model to consider their trends, you might need to use a large weight multiplier. Your model would then have a higher risk of picking up on random noise as trends; you could end up with results like "people named Brian have criminal histories." This is why you need to be careful with weights, especially large ones.

3. Monitor performance using real data.

No company is knowingly creating biased AI, of course; all these discriminatory models probably worked as expected in controlled environments.

Unfortunately, regulators (and the public) don't typically take best intentions into account when assigning liability for ethical violations. That's why you should be simulating real-world applications as much as possible when building algorithms.

It's unwise, for example, to use test groups on algorithms already in production. Instead, run your statistical methods against real data whenever possible. Ask the data team to check simple test questions like "Do tall people default on AI-approved loans more than short people?" If they do, determine why.

When you're examining data, you could be looking for two types of equality: equality of outcome and equality of opportunity.

If you're working on AI for approving loans, result equality would mean that people from all cities get loans at the same rates; opportunity equality would mean that people who would have returned the loan if given the chance are given the same rates regardless of city. Without the latter, the former could still hide bias if one city has a culture that makes defaulting on loans common.

Result equality is easier to prove, but it also means you'll knowingly accept potentially skewed data. While it's harder to prove opportunity equality, it is at least valid morally.
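The difference between the two measures is easier to see with numbers. The sketch below is a toy illustration under loud assumptions: the city labels, the records and the `would_repay` field are invented, and in real lending data you generally cannot observe whether a rejected applicant would have repaid, which is part of why opportunity equality is harder to prove.

```python
def approval_rate(rows):
    """Share of rows that were approved, or None when there are no rows."""
    rows = list(rows)
    if not rows:
        return None
    return sum(1 for r in rows if r["approved"]) / len(rows)


def equality_report(applicants):
    """Per city: approval rate overall (outcome) and among would-be repayers (opportunity)."""
    report = {}
    for city in sorted({a["city"] for a in applicants}):
        in_city = [a for a in applicants if a["city"] == city]
        repayers = [a for a in in_city if a["would_repay"]]
        report[city] = {
            "outcome": approval_rate(in_city),
            "opportunity": approval_rate(repayers),
        }
    return report


def applicant(city, approved, would_repay):
    return {"city": city, "approved": approved, "would_repay": would_repay}


# Hypothetical records; `would_repay` is assumed known purely for illustration.
applicants = [
    applicant("city_x", True, True),
    applicant("city_x", True, True),
    applicant("city_x", False, True),
    applicant("city_x", False, False),
    applicant("city_y", True, True),
    applicant("city_y", True, False),
    applicant("city_y", False, False),
    applicant("city_y", False, False),
]

report = equality_report(applicants)
for city, metrics in report.items():
    print(city, metrics)
```

In this toy data both cities are approved at the same overall rate, so result equality holds, yet a creditworthy applicant in `city_x` is less likely to be approved than one in `city_y`, so opportunity equality does not.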

It's often practically impossible to ensure both types of equality, but oversight and real-world testing of your models should give you the best shot.

Eventually, these ethical AI principles will be enforced by legal penalties. If New York City's early attempts at regulating algorithms are any indication, those laws will likely involve government access to the development process, as well as stringent monitoring of the real-world consequences of AI.

The good news is that by using proper modeling principles, bias can be greatly reduced or eliminated, and those working on AI can help expose accepted biases, create a more ethical understanding of tricky problems and stay on the right side of the law, whatever it ends up being.

Three ways to avoid bias in machine learning