Now You See Me (CME): Concept-based Model Extraction
Problem: Deep Neural Network models are black boxes that cannot be interpreted directly. As a result, it is difficult to build trust in such models. Existing methods, such as Concept Bottleneck Models, make these models more interpretable, but require costly annotation of the underlying concepts.
In recent years, the field of Explainable Artificial Intelligence (XAI) [1] has seen a surge of interest in Concept Bottleneck Model (CBM) approaches [2]. These methods introduce a model architecture in which input images are processed in two distinct phases: concept encoding and concept processing.
During concept encoding, concept information is extracted from the high-dimensional input data. Subsequently, in the concept processing phase, this extracted concept information is used to generate the desired output task label. A salient feature of CBMs is their reliance on a semantically meaningful concept representation, which serves as an intermediate, interpretable layer for downstream task predictions, as shown below:
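As a rough illustration of this two-phase structure, here is a minimal PyTorch sketch of a concept bottleneck model. The encoder architecture, concept count, and class count are placeholder assumptions for illustration, not the setup used in the paper:

```python
import torch
import torch.nn as nn

class ConceptBottleneckModel(nn.Module):
    """Sketch of a CBM: the input is first mapped to a vector of concept
    predictions (concept encoding), which is then mapped to the task label
    (concept processing)."""

    def __init__(self, concept_encoder: nn.Module, n_concepts: int, n_classes: int):
        super().__init__()
        # Concept encoding: maps raw inputs to interpretable concept scores.
        self.concept_encoder = concept_encoder
        # Concept processing: maps concept scores to the downstream task label.
        self.label_predictor = nn.Linear(n_concepts, n_classes)

    def forward(self, x: torch.Tensor):
        concepts = self.concept_encoder(x)       # (batch, n_concepts)
        logits = self.label_predictor(concepts)  # (batch, n_classes)
        # Returning the concepts exposes the intermediate, interpretable representation.
        return concepts, logits


# Illustrative usage with a toy convolutional encoder and hypothetical sizes.
encoder = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),  # 10 hypothetical binary concepts
    nn.Sigmoid(),
)
model = ConceptBottleneckModel(encoder, n_concepts=10, n_classes=5)
concepts, logits = model(torch.randn(2, 3, 64, 64))
```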