Klarity Craft functionality
Extracted from Klarity capacities; once integrated into the documentation, these items will be removed or marked as completed
[ ] <safenai_member> can push and store their artefacts and metrics on KS
[ ] <safenai_member> can push and store their datasets on KS
[ ] <tech_user> can build the required dataset as specified in associated Blueprint
[ ] <tech_user> can integrate AI component baseline as specified in Blueprint
[ ] <customer_other> can easily import Blueprint-specified data into Klarity
[ ] <business_expert> can provide the intended purpose and ROI elements (metrics) so that the corresponding artefacts can be created from KC.
[ ] <safenai_member> can locally create and manipulate artefacts before delivering the configuration that will generate the final ones on the cloud instance of KD
[ ] <tech_user> can store their models on KS
[ ] <tech_user> can instantiate a blueprint to generate configuration and specific code in a KC project
A ./klarity folder at the UC repository root controls KC and repo content
[ ] <safenai_member> can locally develop metrics and artefacts for a specific blueprint
[ ] <tech_user> can locally work and develop a use-case for a specific blueprint
[ ] <tech_user> can use a CI to build and push artefacts from WB/KC to KD
[ ] <tech_user> can start metric computation from models/dataset/feedback/spec as specified in Blueprint
[ ] User can launch computation from the workbench over a blueprint instance of a use-case
[ ] User can develop, test and debug versions on the cloud environment for a specific blueprint
[ ] User will rely on integrated dependency and computation graph management to regenerate elements impacted by any change or evolution
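The dependency/computation-graph item above can be sketched as follows. This is a hypothetical illustration (the `ComputationGraph` class and its methods are not part of Klarity): when an element changes, every downstream element is found in dependency order so it can be regenerated.

```python
# Hypothetical sketch of dependency/computation-graph management:
# a change to one element marks all transitively dependent elements
# for regeneration, in breadth-first (dependency) order.
from collections import defaultdict, deque

class ComputationGraph:
    def __init__(self):
        self.rdeps = defaultdict(set)  # node -> nodes that depend on it

    def add_dependency(self, node, depends_on):
        """Declare that `node` must be regenerated when `depends_on` changes."""
        self.rdeps[depends_on].add(node)

    def impacted_by(self, changed):
        """Return all nodes transitively impacted by a change, in BFS order."""
        seen, queue, order = set(), deque([changed]), []
        while queue:
            node = queue.popleft()
            for nxt in self.rdeps[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    order.append(nxt)
                    queue.append(nxt)
        return order
```

For example, with `dataset -> model -> metric`, a dataset change would schedule the model and then the metric for regeneration.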
[ ] <tech_user> can push AI Component simulation metrics and artefacts that allow their computation by WB before being pushed to the Klarity Dashboard, as required by the Blueprint
Data activities
TODO: match with method activities for consistency
Data augmentation
- Diffusion / GAN autoencoding models trained to
  - create data with the contextual information
- Realistic data perturbation
- Classical data augmentation
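The classical-augmentation and realistic-perturbation items above can be sketched on a 1-D signal. This is a minimal illustration with assumed function names (`jitter`, `realistic_perturbation`); it is not Klarity code, and the drift model is just one example of a realistic perturbation.

```python
# Sketch of classical data augmentation and realistic perturbation
# on a 1-D signal (list of floats). Names and parameters are illustrative.
import random

def jitter(signal, sigma=0.05):
    """Classical augmentation: add small Gaussian noise to each sample."""
    return [x + random.gauss(0.0, sigma) for x in signal]

def realistic_perturbation(signal, drift=0.01):
    """Perturbation plausible in real data: a slow linear sensor drift."""
    return [x + i * drift for i, x in enumerate(signal)]

def augment(dataset, n_copies=1):
    """Keep the originals and append augmented variants of each signal."""
    out = list(dataset)
    for _ in range(n_copies):
        out.extend(jitter(s) for s in dataset)
        out.extend(realistic_perturbation(s) for s in dataset)
    return out
```

A diffusion or GAN autoencoding model would replace these handcrafted transforms with learned, context-conditioned generation.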
Data selection
- Which data to use for training / validation / test
- split strategy
  - random with no overlap in sequence
  - use context to reserve several contexts for test
  - klarity split
    - use an auxiliary model to cluster the data
    - evaluate distances between examples
    - add these clusters to the context, and correlate them with existing metadata
- create n subsets that will be used during the training / evaluation / test process
- create a dummy subset for process check
Data annotation
- use SAM for pre-annotation
- mix real and generated data
Which type of annotations
- basic:
  - true / false (or anomaly intensity)
  - can be real time with an operator
  - optional tags can be proposed
- precise annotation (expert):
  - anomaly tags (type of anomalies)
  - precise location of the anomaly (object, signal portion, ...)
  - optional verbatim
  - possible mixture of experts
- a specific element located in the data: an object, a signal portion, or general information about the sample
==> Based on the type of annotation, who is annotating, and the level of verification (mixture, quality process), a confidence level will be attached to the annotation
==> Compute metrics on user reliability, model reliability, ...; possibly multiple annotations for a same sample
==> Propose automatic annotation with a rationale / overlay on the screen
How do we define an anomaly
- A distance (from a normal sample) ==> during annotation, ask for an anomaly level in the range 0-10 instead of yes / no
- Kind of anomalies (there might be several types in a same sample, so an anomaly vector where the average ~= anomaly intensity)
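The annotation model above can be sketched as a small data structure. This is an assumption-laden illustration: the `Annotation` class, the confidence weights, and the `anomaly_vector` formula are invented for the sketch, not a Klarity specification.

```python
# Sketch of an annotation record: anomaly intensity on a 0-10 scale
# (instead of yes/no), optional anomaly-kind tags, and a confidence
# derived from who annotated and how it was verified.
# Weights and formula are illustrative assumptions.
from dataclasses import dataclass, field

TYPE_WEIGHT = {"basic": 0.6, "expert": 0.9}    # who is annotating
VERIF_WEIGHT = {"none": 0.8, "reviewed": 1.0}  # quality process applied

@dataclass
class Annotation:
    intensity: int                 # anomaly intensity, 0 (normal) .. 10
    kind: str = "basic"            # "basic" or "expert" annotation
    verification: str = "none"     # "none" or "reviewed"
    tags: list = field(default_factory=list)   # anomaly kinds present

    @property
    def confidence(self):
        """Confidence attached to the annotation from type and verification."""
        return TYPE_WEIGHT[self.kind] * VERIF_WEIGHT[self.verification]

def anomaly_vector(annotations, kinds):
    """One slot per anomaly kind; the average approximates overall intensity."""
    n = max(1, len(annotations))
    return [
        sum(a.intensity for a in annotations if k in a.tags) / n
        for k in kinds
    ]
```

Multiple annotations of the same sample can then be aggregated, weighting each by its confidence, to feed the user- and model-reliability metrics mentioned above.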