af.model.algorithms package

Submodules

af.model.algorithms.AfManager module

class af.model.algorithms.AfManager.AfManager[source]

This class acts as a proxy for external applications that want to integrate with the AF.

get_algorithm_instance(data_config, algorithm_name, algorithm_arguments, optimized_processing)[source]

Create an instance of a certain algorithm and return it. It receives all the arguments necessary for the instance creation.

Parameters:
  • data_config (class:af.model.DataConfig instance) – Data configuration object
  • algorithm_name (string) – Name of the algorithm to instantiate
  • algorithm_arguments (list) – List of the algorithm-specific arguments
  • optimized_processing (bool) – Indicates whether the algorithm should try to optimize processing during the transformation
Return type:

class:af.model.algorithms.BaseAlgorithm instance
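
A name-based lookup like the one described above can be sketched as follows. This is only an illustration under stated assumptions: SketchBaseAlgorithm, SketchDatafly and create_instance are hypothetical stand-ins, not the classes or functions of this package, and the lookup inspects direct subclasses only to keep the sketch short.

```python
class SketchBaseAlgorithm(object):
    """Hypothetical, simplified stand-in for BaseAlgorithm."""
    ALGORITHM_NAME = None

    def __init__(self, data_config, optimized_processing=False):
        self.data_config = data_config
        self.optimized_processing = optimized_processing


class SketchDatafly(SketchBaseAlgorithm):
    """Hypothetical algorithm that takes one particular argument, k."""
    ALGORITHM_NAME = 'Datafly'

    def __init__(self, data_config, k, optimized_processing=False):
        super(SketchDatafly, self).__init__(data_config, optimized_processing)
        self.k = k


def create_instance(data_config, algorithm_name, algorithm_arguments,
                    optimized_processing):
    """Find the class whose ALGORITHM_NAME matches and instantiate it."""
    for cls in SketchBaseAlgorithm.__subclasses__():
        if cls.ALGORITHM_NAME == algorithm_name:
            # The particular arguments are passed positionally, in list order.
            return cls(data_config, *algorithm_arguments,
                       optimized_processing=optimized_processing)
    raise ValueError('Unknown algorithm: %s' % algorithm_name)
```

Calling create_instance(config, 'Datafly', [3], True) would return a SketchDatafly whose k is 3; an unknown name raises ValueError in this sketch.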

get_algorithms(privacy_model)[source]

Return a list of all the available algorithms.

Return type:List of algorithm names
get_algoritm_parameters(algorithm_selected)[source]

Return all the algorithm-specific parameters that an algorithm needs in order to be used. The common arguments are self, data_config and optimized_processing.

Parameters:algorithm_selected (string) – Name of the algorithm
Return type:List of arguments
get_all_algorithms()[source]

Retrieve all the algorithms that are subclasses of BaseAlgorithm

Return type:List of algorithm classes
get_all_subclasses(cls)[source]

Given a class, retrieve all of its subclasses (from root to leaves)

Parameters:cls – A class
Return type:list of all subclasses of cls
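
One way to collect subclasses from root to leaves is to recurse over the built-in __subclasses__() method. The function below is a minimal sketch of that idea and is not necessarily how this method is implemented; collect_subclasses is a hypothetical name.

```python
def collect_subclasses(cls):
    """Return every direct and indirect subclass of cls, parents before children."""
    found = []
    for subclass in cls.__subclasses__():
        if subclass not in found:
            found.append(subclass)
        # Recurse so that grandchildren and deeper descendants are included.
        for nested in collect_subclasses(subclass):
            if nested not in found:
                found.append(nested)
    return found


class Root(object):
    pass


class Child(Root):
    pass


class GrandChild(Child):
    pass


class OtherChild(Root):
    pass


subclasses_of_root = collect_subclasses(Root)
```

The membership checks avoid duplicates when a class is reachable through more than one parent (multiple inheritance).
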
static load_modules()[source]

Load every module contained inside the algorithms module directory.
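
Importing every module of a package is usually done with pkgutil and importlib from the standard library. The sketch below assumes the target is a regular package with a __path__ attribute; import_all_modules is a hypothetical helper, and email.mime is used only because it is a small standard library package that is safe to import.

```python
import email.mime
import importlib
import pkgutil


def import_all_modules(package):
    """Import every module found directly inside the given package."""
    loaded = []
    for _finder, name, _is_package in pkgutil.iter_modules(package.__path__):
        full_name = package.__name__ + '.' + name
        loaded.append(importlib.import_module(full_name))
    return loaded


loaded_names = sorted(module.__name__
                      for module in import_all_modules(email.mime))
```

Importing the modules up front matters because __subclasses__() only reports classes whose defining module has already been imported.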

af.model.algorithms.BaseAlgorithm module

class af.model.algorithms.BaseAlgorithm.BaseAlgorithm(data_config, optimized_processing=False)[source]

Bases: object

Base class for all algorithms that are to be implemented. It defines all the methods needed to achieve a successful anonymization. Some are already implemented and shouldn’t be changed; others were left unimplemented so that each algorithm can design them according to its own needs.

ALGORITHM_NAME = None
PRIVACY_MODEL = None
anonymize(*args, **kw)
insert_additional_information(*args, **kw)
obtain_qi_most_frequently(*args, **kw)
obtain_quasi_identifier_frequencies(*args, **kw)
on_post_process()[source]

All the steps to be taken after the transformation process should be put here. The method is called after the process method during the anonymization.

on_pre_process()[source]

All the steps that are preconditions for the anonymization process should be put here. The method is called prior to the process method during the anonymization. The BaseAlgorithm implementation performs the minimum validation: it validates the arguments and calls the PreProcessingStage preprocess method.

process()[source]

Abstract method. It must contain all the steps needed to complete the anonymization process.
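
The call order described for on_pre_process, process and on_post_process follows the template method pattern. The sketch below only illustrates that order; SketchAlgorithm and SuppressFirstColumn are hypothetical classes, and the body of anonymize here is an assumption rather than the code of this package.

```python
class SketchAlgorithm(object):
    """Hypothetical stand-in for BaseAlgorithm that only shows the call order."""

    def __init__(self, rows):
        self.rows = rows
        self.calls = []

    def validate_arguments(self):
        if not isinstance(self.rows, list):
            raise ValueError('rows must be a list')
        return True

    def on_pre_process(self):
        # Preconditions: validate before any transformation happens.
        self.validate_arguments()
        self.calls.append('on_pre_process')

    def process(self):
        # Left unimplemented on purpose; each algorithm provides its own.
        raise NotImplementedError('each algorithm implements process()')

    def on_post_process(self):
        self.calls.append('on_post_process')

    def anonymize(self):
        self.on_pre_process()
        self.process()
        self.on_post_process()
        return self.rows


class SuppressFirstColumn(SketchAlgorithm):
    """Toy algorithm: replaces the first value of every row with '*'."""

    def process(self):
        self.calls.append('process')
        self.rows = [('*',) + tuple(row[1:]) for row in self.rows]
```

A subclass only overrides process (and optionally the two hooks); the order of the three steps stays fixed in the base class.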

remove_rows(*args, **kw)
save_anonymization_info_on_data_config()[source]

After the anonymization process, the data configuration instance has to be populated with information about the new db location, the anonymized table, and the metrics table (which contains additional information about the transformation process)

update_qi_values(*args, **kw)
validate_anonymize_conditions()[source]

Validates the anonymization conditions. Abstract Method.

validate_arguments()[source]

Validates all the arguments that are used during the anonymization process

Return type:Boolean True/False

af.model.algorithms.BaseKAlgorithm module

class af.model.algorithms.BaseKAlgorithm.BaseKAlgorithm(data_config, k, optimized_processing=False)[source]

Bases: af.model.algorithms.BaseAlgorithm.BaseAlgorithm

Class extending the BaseAlgorithm class. It is intended as a base for those algorithm implementations that use k-anonymity as their privacy model.

validate_anonymize_conditions(*args, **kw)
validate_arguments()[source]

Validates all the arguments that are used during the anonymization process. If an error occurs, it raises an exception.

Return type:True if arguments are valid.

af.model.algorithms.Datafly module

class af.model.algorithms.Datafly.Datafly(data_config, k, optimized_processing=False)[source]

Bases: af.model.algorithms.BaseKAlgorithm.BaseKAlgorithm

Datafly Algorithm implementation

ALGORITHM_NAME = 'Datafly'
PRIVACY_MODEL = 'k'
on_post_process()[source]

After anonymizing, rename the table to the common anonymization table name that is defined.

process()[source]

The core routine that anonymizes the data using the Datafly algorithm
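
For orientation, the Datafly heuristic as it is commonly described can be sketched in memory: while more than k rows belong to quasi-identifier groups smaller than k, generalize the attribute with the most distinct values, then suppress the remaining outliers. This is not the code behind this module; datafly_sketch, generalize_age and generalize_zip are hypothetical names, and the generalization hierarchies are invented for the example.

```python
from collections import Counter


def generalize_age(value):
    """Hypothetical hierarchy: exact age -> decade range -> '*'."""
    if isinstance(value, int):
        low = value // 10 * 10
        return '%d-%d' % (low, low + 9)
    return '*'


def generalize_zip(value):
    """Hypothetical hierarchy: mask one more trailing digit on each call."""
    stripped = value.rstrip('*')
    if not stripped:
        return value
    return stripped[:-1] + '*' * (len(value) - len(stripped) + 1)


def datafly_sketch(rows, hierarchies, k):
    """rows: tuples of quasi-identifier values; hierarchies: one function per column."""
    rows = list(rows)
    while True:
        counts = Counter(rows)
        outliers = [row for row in rows if counts[row] < k]
        if len(outliers) <= k:
            # Few enough outliers remain: suppress them and stop.
            return [row for row in rows if counts[row] >= k]
        # Generalize the column that currently has the most distinct values.
        distinct = [len(set(row[i] for row in rows))
                    for i in range(len(hierarchies))]
        target = distinct.index(max(distinct))
        generalized = [
            row[:target] + (hierarchies[target](row[target]),) + row[target + 1:]
            for row in rows
        ]
        if generalized == rows:
            # The hierarchy cannot generalize further: suppress and stop.
            return [row for row in rows if counts[row] >= k]
        rows = generalized


patients = [(34, '13053'), (36, '13053'), (31, '13053'),
            (38, '13053'), (52, '14850')]
anonymized = datafly_sketch(patients, [generalize_age, generalize_zip], k=2)
```

With this input, one generalization of the age column groups four rows as ('30-39', '13053'), and the single remaining outlier is suppressed.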

af.model.algorithms.GeneralizationLatticeGraph module

class af.model.algorithms.GeneralizationLatticeGraph.GLGNode(subset, glg_lvl, qi_keys, marked=False)[source]

This class models a node inside a GeneralizationLatticeGraph. A GLGNode contains information about the level at which it is placed, its subset of dimension values, and whether it is marked as a valid node to be used for anonymization.

class af.model.algorithms.GeneralizationLatticeGraph.GeneralizationLatticeGraph(qi_info)[source]

This class is used to create a graph that relates all the dimension combination nodes to each other. In particular, given a certain GLGNode (basically a subset of combinations of all the qi attribute values), it is used to find which GLGNodes are connected to it.

create_bfs_structure()[source]

Creates the BFS structure to be used later during the anonymization process. It maintains a cache list of all the leaf nodes that appeared.

get_lvl_subnodes(lvl)[source]

Return all the GLGNodes that are in that specific level.

Parameters:lvl (int) – Level to analyze
Return type:List of GLGNodes
get_marked_nodes(marked=True)[source]

Return a list of all those GLGNodes inside the GeneralizationLatticeGraph that are marked with a certain value.

Parameters:marked (bool) – True or False (Default to True)
Return type:List of nodes that are marked as required.
get_upper_level_nodes(node, lvl)[source]

Return all the nodes that are above a given GLGNode located at a given lvl.

Parameters:
  • node – GLGNode whose upper nodes are to be found
  • lvl (int) – Level in which the node is located
Return type:

List of GLGNodes that are parents of the node
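
The idea of levels and parent nodes in a generalization lattice can be sketched with plain tuples. The representation is an assumption made for illustration: each node is a tuple holding one generalization level per quasi-identifier, a node's lattice level is the sum of those values, and a parent generalizes exactly one attribute by one step. The names lattice_levels and upper_nodes are hypothetical and do not reflect the GLGNode structure used by this module.

```python
from itertools import product


def lattice_levels(max_levels):
    """Group every combination of generalization levels by its total height."""
    levels = {}
    for node in product(*(range(top + 1) for top in max_levels)):
        levels.setdefault(sum(node), []).append(node)
    return levels


def upper_nodes(node, max_levels):
    """Return the nodes reached by generalizing exactly one attribute one step."""
    parents = []
    for index, level in enumerate(node):
        if level < max_levels[index]:
            parents.append(node[:index] + (level + 1,) + node[index + 1:])
    return parents
```

For two attributes with hierarchy heights 1 and 2, lattice_levels((1, 2)) places (0, 1) and (1, 0) on level 1, and upper_nodes((0, 1), (1, 2)) yields (1, 1) and (0, 2).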

mark_valid_subnode(node)[source]

Given a node, mark it as a valid option to be used during the anonymization process

Parameters:node – GLGNode to be marked as valid

af.model.algorithms.IncognitoK module

class af.model.algorithms.IncognitoK.IncognitoK(data_config, k=2, optimized_processing=False)[source]

Bases: af.model.algorithms.BaseKAlgorithm.BaseKAlgorithm

Incognito-K Algorithm implementation. It is mainly based on heavy join queries across tables that contain dimensions of qi attributes and different values

ALGORITHM_NAME = 'Incognito K'
PRIVACY_MODEL = 'k'
additional_anonymization_information()[source]

Add particular anonymization information of the process to the dictionary of additional information

checks_model_conditions(node)[source]

Call every method that contains a model validation. In this particular case, that is only the k condition. It can be reimplemented by algorithms that inherit from this class.

Parameters:node – GLGNode to use
Return type:Boolean indicating whether all the conditions have been met.
choose_generalization(*args, **kw)
create_check_k_condition_query(*args, **kw)
create_condition_queries(*args, **kw)
create_table_hierarchies_star_schema(*args, **kw)
create_walking_bfs_hierarchy_levels_tree(*args, **kw)
dump_anonymized_data(*args, **kw)
insert_values_on_dimension_tables(*args, **kw)
normal_filter(list_to_filter, filter_method)[source]

Auxiliary method that filters a list using a given filter method and returns the filtered list

Parameters:
  • list_to_filter (list) – List to be filtered
  • filter_method – A method that receives 2 subsets and decides how to filter based on its own conditions
Return type:

List filtered by the filter_method

on_post_process()[source]

After the anonymization process has ended, save particular information of it

process(*args, **kw)
retrieve_possible_generalizations(*args, **kw)
subnode_checks_k_condition(node)[source]

Method that queries the table for a given node to check whether its subset satisfies the k-model condition

Parameters:node – GLGNode to use
Return type:True if the k condition has been met, False otherwise
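
A k-condition check against a table can be expressed as a GROUP BY query that looks for groups smaller than k. The sketch below uses an in-memory SQLite database from the standard library; the table, its columns and the satisfies_k helper are hypothetical, and the query is not the one built by create_check_k_condition_query.

```python
import sqlite3


def satisfies_k(connection, table, qi_columns, k):
    """Return True when every quasi-identifier group has at least k rows."""
    columns = ', '.join(qi_columns)
    # Table and column names are interpolated, so they must come from trusted
    # configuration; only k is passed as a bound parameter.
    query = ('SELECT COUNT(*) FROM (SELECT 1 FROM {table} '
             'GROUP BY {columns} HAVING COUNT(*) < ?)'
             ).format(table=table, columns=columns)
    (violating_groups,) = connection.execute(query, (k,)).fetchone()
    return violating_groups == 0


connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE patients (age TEXT, zip TEXT, disease TEXT)')
connection.executemany('INSERT INTO patients VALUES (?, ?, ?)', [
    ('30-39', '130**', 'flu'),
    ('30-39', '130**', 'cold'),
    ('40-49', '148**', 'flu'),
])
```

Here the group ('40-49', '148**') holds a single row, so the check fails for k = 2 and passes for k = 1.
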
weighted_filter(list_to_filter)[source]

Specific filter that tries to reduce a given list, using the subset weights as the filter key

Parameters:list_to_filter (list) – List intended to be filtered
Return type:A list filtered by attribute weight

af.model.algorithms.IncognitoL module

class af.model.algorithms.IncognitoL.IncognitoL(data_config, k=3, l=2, optimized_processing=False)[source]

Bases: af.model.algorithms.IncognitoK.IncognitoK

Incognito-L Algorithm implementation. Extends the IncognitoK algorithm implementation

ALGORITHM_NAME = 'Incognito L'
PRIVACY_MODEL = 'l'
checks_model_conditions(node)[source]

Check all the model conditions to determine whether the given node meets the requirements necessary to be considered a possible generalization

Parameters:node – GLGNode instance to be used
Return type:Boolean indicating if the node can be used as a generalization for the anonymization process
create_check_l_condition_query(*args, **kw)
create_condition_queries(*args, **kw)
load_sensitive_attribute()[source]

The L-diversity model is based on a sensitive attribute. Select, from the data configuration, the sensitive attribute to be taken into account.

on_post_process()[source]

After the anonymization process has ended, save particular information of it

subnode_checks_l_condition(node)[source]

Specific method that checks the L condition given a node

Parameters:node – GLGNode instance to be used
Return type:Boolean indicating if the L condition has been met or not.
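
The documentation does not state which l-diversity variant is checked. Assuming the simplest one (distinct l-diversity, where every quasi-identifier group must contain at least l different sensitive values), the condition can be sketched in memory as follows; satisfies_l and its parameters are hypothetical names.

```python
from collections import defaultdict


def satisfies_l(rows, qi_indexes, sensitive_index, l_value):
    """Return True when every group holds at least l_value distinct sensitive values."""
    sensitive_values = defaultdict(set)
    for row in rows:
        group_key = tuple(row[index] for index in qi_indexes)
        sensitive_values[group_key].add(row[sensitive_index])
    return all(len(values) >= l_value for values in sensitive_values.values())


records = [
    ('30-39', '130**', 'flu'),
    ('30-39', '130**', 'cold'),
    ('40-49', '148**', 'flu'),
    ('40-49', '148**', 'flu'),
]
```

Both groups in records satisfy k = 2, yet the second one contains only 'flu', so satisfies_l(records, [0, 1], 2, 2) is False: this is the kind of case the L condition adds on top of the k condition.
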
validate_arguments()[source]

Validates the general arguments, such as the data config and the k value (using the parent classes), and also validates the L value

Module contents