Summary of the specific criteria that relate to each value considered in our ML assessment framework. These criteria are then translated into specific manifestations in the form of signifiers (orange), process-oriented practices (olive) or quantifiable indicators (magenta).
| Value | Criteria | Manifestations |
|---|---|---|
| Privacy | 1. Consent for data usage ([1], [2], [3]) 2. Data protection ([2], [3], [4]) 3. Control over data / ability to restrict processing ([1], [2]) 4. Right to rectification ([1], [2], [3]) 5. Right to erasure of the data ([1], [2], [3]) 6. Right of access by data subject, data agency ([1], [5]) | - Written declaration of consent ([1]) - Description of what data is collected ([6]) - Description of how data is handled ([6]) - Purpose statement of data collection ([6]) - Statement of how long the data is kept ([6]) - Form and submission mechanisms to object to data collection and to make complaints ([7]) - Obfuscation of data ([3]) |
| Security | 1. Resilience to attacks: protection of privacy ([8], [9], [10]), vulnerabilities, fallback plans ([2], [3], [11], [12]) 2. Predictability ([2], [3], [13]) 3. Robustness / reliability: prevention of manipulation ([3]) | AGAINST INTEGRITY THREATS ([14]): Training time ([14]), e.g.: - Data sanitization[^1] ([15], [16]) - Robust learning[^2] ([15], [17]) Prediction time ([14]): - Model enhancement ([15], [18], [19], [20]) - Adversarial learning[^3] - Gradient masking[^4] - Defensive distillation[^5] AGAINST PRIVACY THREATS ([14]): Mitigation techniques ([21]): - Restrict the prediction vector to the top k classes[^6] ([22]) - Coarsen the precision of the prediction vector[^7] ([22]) - Increase the entropy of the prediction vector[^8] ([22]) - Use regularization[^9] ([22], [23]) Differential privacy mechanisms ([21]): - Differential privacy[^10] ([24], [25]), e.g.: - Adversarial regularization[^11] ([21]) - MemGuard[^12] ([26]) |
| Performance | 1. Correctness of predictions ([2], [13], [27]) 2. Memory efficiency ([3], [27]) 3. Training efficiency ([27]) 4. Energy efficiency ([3], [27]) 5. Data efficiency ([27]) | - Accuracy ([28], [29]) - False positive and negative rates ([28], [29]) - False discovery and omission rates ([28]) - Mean and median error ([29]) - R2 score ([30]) - Precision and recall ([29]) - Area under the ROC curve (AUC) ([30]) - Estimation of energy consumption through performance counters, simulation, instruction- or architecture-level estimations, or real-time estimation ([31]) - Estimation of GPU memory consumption ([32], [33]) - Wall-clock training time ([34], [35]) |
| Respect for public interest | 1. Desirability of technology ([36], [37], [38]) 2. Benefit to society ([2], [4], [11], [39]) 3. Environmental impact ([3], [40]) | - Diverse and inclusive forum for discussion ([2], [41]) - Measures of social and environmental impact ([11], [40], [42]) |
| Fairness | 1. Individual fairness[^13] ([25], [43], [44], [45]) 2. Demographic parity[^14] ([9], [25], [43], [44], [45], [46], [47], [48], [49]) 3. Conditional statistical parity[^15] ([43], [49]) 4. Equality of opportunity[^16] ([43], [50], [51]) 5. Equalized odds[^17] ([43]) 6. Treatment equality[^18] ([43], [52]) 7. Test fairness[^19] ([43], [49], [53]) 8. Procedural fairness[^20] ([43], [45], [54]) | - Accuracy across groups ([11], [46], [53], [55]) - False positive and negative rates across groups ([43], [53], [55], [56], [57]) - False discovery and omission rates across groups ([28], [57]) - Pinned AUC ([28], [58]) - Debiasing algorithms ([59]) - Selection of protected classes based on user considerations ([54]) |
| Non-discrimination | 1. Quality and integrity of data ([2], [9], [11], [60], [61]) 2. Inclusiveness in design ([2], [11], [13]) 3. Accessibility ([2], [3], [11], [27]) | - Inclusive data generation process ([3], [11], [36], [60]) - Analysis of data for potential biases; data quality assessment ([2], [3], [9], [43], [62]) - Diversity of participants in the development process ([2], [3], [63], [64]) - Access to code and technology for all ([2], [3], [11], [27]) |
| Transparency | 1. Interpretability of data and models ([27], [65]) 2. Enabling human oversight of operations ([2], [11]) 3. Accessibility of data and algorithms ([2], [3], [65]) 4. Traceability ([11]) 5. Reproducibility ([27]) | - Description of the data generation process ([3], [11], [36], [60], [62], [66]) - Disclosure of the origin and properties of models and data ([3], [28], [65]) - Open access to data and algorithms ([2], [3], [27], [65]) - Notification of usage/interaction ([2]) - Regular reporting ([2]) |
| Explainability | 1. Ability to understand AI systems and the decisions reached ([4], [13], [27], [39], [65], [67]) 2. Traceability ([11]) 3. Enabling evaluation ([2], [11]) | - Interpretability by design ([44]) - Post-hoc explanations ([44]) |
| Contestability | 1. Enabling argumentation/negotiation against a decision ([2], [13], [65], [68], [69], [70], [71], [72]) 2. Citizen empowerment ([13], [68], [71]) | - Information on who determines what constitutes a contestable decision and who is accountable ([72]) - Determination of who can contest the decision (the subject or a representative) ([72]) - Indication of the type of review in place ([72]) - Information regarding the contestability workflow ([72]) - Mechanisms for users to ask questions and record disagreements with system behavior ([73], [74]) |
| Human Control | 1. User/collective influence ([27], [69]) 2. Human review of automated decisions ([2]) 3. Choice of how and whether to delegate ([2]) | - Continuous monitoring of the system in order to intervene ([2], [13], [75]) - Establishment of levels of human discretion during use of the system ([8], [13]) - Ability to override the decision made by the system ([13]) |
| Human Agency | 1. Respect for human autonomy ([2], [11], [13]) 2. Power to decide: ability to make informed, autonomous decisions ([13], [27]) 3. Ability to opt out of an automated decision ([2], [13]) | - Provision of knowledge and tools to comprehend and interact with the AI system ([13]) - Opportunity to self-assess the system ([13]) |
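
Many of the quantifiable indicators listed in the Performance and Fairness rows (accuracy, precision, recall, and false positive/negative rates, along with their per-group counterparts) are simple functions of the binary confusion matrix. A minimal sketch for the binary case; the function name and output layout are ours, not taken from the cited works:

```python
import numpy as np

def binary_rates(y_true, y_pred):
    """Confusion-matrix-based indicators for a binary classification task."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "fpr": fp / (fp + tn),   # false positive rate
        "fnr": fn / (fn + tp),   # false negative rate
    }
```

The per-group fairness indicators are obtained by evaluating the same quantities on the subset of samples belonging to each group and comparing the results.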
[^1]: It ensures data soundness by identifying abnormal input samples and removing them ([14])
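
One simple realization of such sanitization is a z-score filter that drops training samples with abnormally deviant features. A sketch under that assumption; the function name and the threshold of 3 standard deviations are ours:

```python
import numpy as np

def sanitize(X, y, z_thresh=3.0):
    """Drop training samples whose features deviate abnormally from the mean.

    A sample is considered abnormal if any of its features lies more than
    `z_thresh` standard deviations from that feature's column mean.
    """
    mu = X.mean(axis=0)
    sigma = X.std(axis=0) + 1e-12          # avoid division by zero
    z = np.abs((X - mu) / sigma)
    keep = (z < z_thresh).all(axis=1)      # keep rows with no abnormal feature
    return X[keep], y[keep]
```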
[^2]: It ensures that algorithms are trained on statistically robust datasets, with little sensitivity to outliers ([14])

[^3]: Adversarial samples are introduced into the training set ([14])

[^4]: Input gradients are modified to enhance model robustness ([14])
[^5]: A second, "distilled" model is trained on the soft class-probability outputs of the original model; this smooths the learned decision function and reduces its sensitivity to adversarial perturbations

[^6]: Applicable when the number of classes is very large: even if the model only outputs the k most likely classes, it will still be useful ([22])
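
The top-k restriction can be sketched as follows (the function name is ours): all but the k largest entries of the prediction vector are zeroed out and the remainder renormalized.

```python
import numpy as np

def restrict_top_k(probs, k):
    """Keep only the k largest entries of a prediction vector and
    renormalize, so the released output reveals only the top-k classes."""
    out = np.zeros_like(probs)
    top = np.argsort(probs)[-k:]      # indices of the k most likely classes
    out[top] = probs[top]
    return out / out.sum()
```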
[^7]: It consists in rounding the classification probabilities down ([22])
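
Rounding down to a fixed number of decimal places can be written directly (the function name and default precision are ours):

```python
import numpy as np

def coarsen(probs, decimals=1):
    """Round classification probabilities down to `decimals` decimal places,
    limiting the precision (and hence the information) in the released vector."""
    factor = 10.0 ** decimals
    return np.floor(probs * factor) / factor
```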
[^8]: Modification of the softmax layer (in neural networks) to increase its normalizing temperature ([22])
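
A temperature-raised softmax looks like this (the function name is ours): dividing the logits by a temperature T > 1 flattens the output distribution and increases its entropy.

```python
import numpy as np

def softmax_with_temperature(logits, T):
    """Softmax with normalizing temperature T: larger T yields a flatter,
    higher-entropy distribution that leaks less per-sample confidence."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()                      # for numerical stability
    e = np.exp(z)
    return e / e.sum()
```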
[^9]: A technique to avoid overfitting in ML that penalizes large parameters by adding a regularization factor $\lambda$ to the loss function ([22])
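
For a squared-error loss with an L2 penalty, the regularized objective is mean squared error plus $\lambda \lVert w \rVert^2$. A sketch (the function name is ours):

```python
import numpy as np

def l2_regularized_loss(w, X, y, lam):
    """Mean squared error plus the L2 penalty lam * ||w||^2;
    the penalty discourages large parameters and hence overfitting."""
    residual = X @ w - y
    return residual @ residual / len(y) + lam * (w @ w)
```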
[^10]: It prevents any adversary from distinguishing a model's predictions when its training dataset is used from its predictions when another dataset is used ([24])
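
The classic mechanism achieving $\epsilon$-differential privacy for a numeric query of sensitivity $\Delta$ adds Laplace noise with scale $\Delta/\epsilon$. A sketch of that standard mechanism (not the training-time techniques of the cited works; the function name is ours):

```python
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon, rng):
    """Release a numeric query answer with epsilon-differential privacy
    by adding zero-mean Laplace noise with scale sensitivity / epsilon."""
    return true_answer + rng.laplace(scale=sensitivity / epsilon)
```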
[^11]: Membership privacy is modeled as a min-max optimization problem, in which the model is trained to achieve minimum loss of accuracy and maximum robustness against the strongest inference attack ([21])

[^12]: Noise is added to the model's confidence vector so as to mislead the attacker's classifier ([26])

[^13]: Similar individuals should be treated in a similar way. Diverging definitions state that two individuals who are similar with respect to a common metric should receive the same outcome (fairness through awareness); that no protected attribute should be used when making a decision (fairness through unawareness); or that the outcome obtained by an individual should be the same if that individual belonged to a counterfactual world or group (counterfactual fairness) ([43])
[^14]: The probability of getting a positive outcome should be the same whether the individual belongs to a protected group or not ([43])
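
The corresponding indicator compares positive-outcome rates across the two groups; a gap of zero means demographic parity holds exactly. A sketch (the function name and group encoding are ours):

```python
import numpy as np

def demographic_parity_gap(y_pred, protected):
    """Difference between the positive-outcome rates of the protected
    (protected == 1) and unprotected (protected == 0) groups."""
    y_pred, protected = np.asarray(y_pred), np.asarray(protected)
    return y_pred[protected == 1].mean() - y_pred[protected == 0].mean()
```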
[^15]: Given a set of factors L, individuals belonging to the protected and unprotected groups should have the same probability of getting a positive outcome ([43])

[^16]: The probability for a person from class A (the positive class) of getting a positive outcome should be the same regardless of the group (protected or not) that the individual belongs to ([43])

[^17]: The probability for a person from class A (the positive class) of getting a positive outcome and the probability for a person from class B (the negative class) of getting a negative outcome should each be the same for both groups ([43])
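
Equalized odds is commonly checked by comparing true-positive and false-positive rates across the two groups; both gaps equal to zero means the criterion holds. A sketch (names are ours):

```python
import numpy as np

def equalized_odds_gaps(y_true, y_pred, group):
    """Per-group true-positive-rate and false-positive-rate differences;
    equalized odds holds when both gaps are 0."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))

    def rate(g, label):
        # Positive-prediction rate among samples of group g with true label
        mask = (group == g) & (y_true == label)
        return y_pred[mask].mean()

    return {
        "tpr_gap": rate(1, 1) - rate(0, 1),
        "fpr_gap": rate(1, 0) - rate(0, 0),
    }
```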
[^18]: The ratio of false positives to false negatives has to be the same for both groups ([43])

[^19]: For any probability score S, the probability of correctly belonging to the positive class should be the same for both the protected and unprotected groups ([43])

[^20]: It concerns the fairness of the decision-making process that leads to the outcome in question ([54])