Tutorials and Introductory remarks

The Challenge of Crafting Intelligible Intelligence: In this paper, Bansal and Weld argue that to build trust, an interpretability system should interact with its stakeholders in an almost conversational style. They state:

"The key challenge for designing intelligible AI is communicating a complex computational process to a human. This requires interdisciplinary skills, including HCI as well as AI and machine learning expertise."

Some notions addressed in the paper:

  • One suggested criterion is human simulatability (Lipton '16): can a human user easily predict the model’s output for a given input? By this definition, sparse linear models are more interpretable than dense or non-linear ones.
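
To make the simulatability criterion concrete, here is a minimal sketch (hypothetical data and model choice, not taken from the paper): a sparse linear model is easy to simulate because, once the handful of non-zero coefficients is known, a user can reproduce the prediction by hand.

```python
# Minimal sketch (hypothetical data, not from the paper): a sparse linear model
# is "simulatable" because its prediction can be reproduced by hand from a few
# non-zero coefficients.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                     # 10 candidate features
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)

model = Lasso(alpha=0.1).fit(X, y)                 # L1 penalty zeroes out most weights
nonzero = np.flatnonzero(model.coef_)
print("non-zero coefficients:", {int(i): round(float(model.coef_[i]), 2) for i in nonzero})

# Simulating the model by hand for a new input: sum the few weighted features.
x_new = np.zeros(10)
x_new[0], x_new[1] = 1.0, 0.5
print("model prediction:", float(model.predict(x_new.reshape(1, -1))[0]))
print("hand computation:", float(x_new @ model.coef_ + model.intercept_))
```

A dense or non-linear model offers no such shortcut: the user would have to trace every weight or activation, which is exactly what makes it harder to simulate.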

What Errors do ML systems show?

  • AI may have the Wrong Objective
  • AI may be Using Inadequate Features: for example, features that are merely correlated with the label in the training data (see the sketch after this list)
  • Distributional Drift
  • Facilitating User Control: Many AI systems induce user preferences from their actions. For example, adaptive news feeds predict which stories are likely most interesting to a user. As robots become more common and enter the home, preference learning will become ever more common. If users understand why the AI performed an undesired action, they can better issue instructions that will lead to improved future behavior.
  • User Acceptance: Even if they don’t seek to change system behavior, users have been shown to be happier with and more likely to accept algorithmic decisions if they are accompanied by an explanation [18]. After being told that they should have their kidney removed, it’s natural for a patient to ask the doctor why — even if they don’t fully understand the answer.
  • Improving Human Insights
  • Legal Imperatives
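
The "inadequate features" and "distributional drift" items can be illustrated with a small synthetic sketch (my own illustration, not taken from the paper): a classifier given a feature that is merely correlated with the label at training time will rely on it, and its accuracy drops once that correlation disappears at deployment. Inspecting the learned weights is what exposes the problem.

```python
# Minimal sketch (synthetic data, assumed for illustration): a spurious feature
# correlated with the label only in the training set. The model leans on it,
# and performance drops when the correlation disappears (distributional drift).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
causal = rng.normal(size=n)                          # the feature that truly matters
label = (causal + 0.3 * rng.normal(size=n) > 0).astype(int)
spurious = label + 0.1 * rng.normal(size=n)          # correlated with the label only at train time

X_train = np.column_stack([causal, spurious])
clf = LogisticRegression(max_iter=1000).fit(X_train, label)
print("weights [causal, spurious]:", clf.coef_.round(2))   # the spurious weight dominates

# Deployment: the spurious correlation is gone, so accuracy falls toward chance.
X_test = np.column_stack([rng.normal(size=n), rng.normal(size=n)])
y_test = (X_test[:, 0] + 0.3 * rng.normal(size=n) > 0).astype(int)
print("train accuracy:", round(clf.score(X_train, label), 3))
print("test accuracy: ", round(clf.score(X_test, y_test), 3))
```

An interpretability method that surfaces these weights lets a user notice the reliance on the spurious feature before the drift causes failures in production.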

Tutorial at AAAI 19

Web page and slides

Explanation and Persuasion Theory

Taken from: Progressive Disclosure: Empirically Motivated Approaches to Designing Effective Transparency

People interact with computers and intelligent systems in ways that mirror how they interact with other people [62,70]. Given that transparency is essentially an explanation of why a model made a given prediction, we can turn to fields such as psychology and sociology for guidance about operationalizing explanations. These fields have a long history of studying explanation. One approach is to model causal explanation as a form of conversation governed by common-sense conversational rules [36] such as Grice’s maxims [30]. In addition, when an explanation is needed and a communication breakdown occurs, this is remedied by a phenomenon known as conversational repair. Conversational repair is interactional: participants in the conversation collaborate to achieve mutual understanding, often in a turn-by-turn structure with repeated questions and clarifications [73]. These theories indicate that we should operationalize transparency in ways that fit human communication and repair strategies.
