API Reference

Submodules

datafaker.create module

Functions and classes to create and populate the target database.

class datafaker.create.StoryIterator(stories: Iterable[tuple[str, Generator[Tuple[str, dict[str, Any]], dict[str, Any], NoneType]]], table_dict: Mapping[str, sqlalchemy.schema.Table], table_generator_dict: Mapping[str, TableGenerator], dst_conn: sqlalchemy.Connection)

Bases: object

Iterates through all the rows produced by all the stories.

has_table(table_name: str) bool

Check if we have a row for table table_name.

insert(metadata: sqlalchemy.schema.MetaData) None

Put the row in the table.

Call this after __init__ or next, and after checking that is_ended returns False.

is_ended() bool

Check whether there are no more rows to process.

If this returns False, insert() can be called.

next() None

Advance to the next row.

table_name() str | None

Get the name of the current table.

Returns:

The table name, or None if there are no more stories to process.
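A minimal driver loop for this class might look like the following sketch, assuming the stories, table_dict, table_generator_dict, dst_conn and metadata objects have already been built:

story_iterator = StoryIterator(stories, table_dict, table_generator_dict, dst_conn)
while not story_iterator.is_ended():
    story_iterator.insert(metadata)  # write the current row to its table
    story_iterator.next()            # advance to the next row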

datafaker.create.create_db_data(sorted_tables: Sequence[sqlalchemy.schema.Table], df_module: module, num_passes: int, metadata: sqlalchemy.schema.MetaData) Counter[str]

Connect to a database and populate it with data.

datafaker.create.create_db_data_into(sorted_tables: Sequence[sqlalchemy.schema.Table], df_module: module, num_passes: int, db_dsn: str, schema_name: str | None, metadata: sqlalchemy.schema.MetaData) Counter[str]

Populate the database.

Parameters:
  • sorted_tables – The table names to populate, sorted so that foreign keys’ targets are populated before the foreign keys themselves.

  • df_module – The generated df.py module of generator classes used to make the data.

  • num_passes – Number of passes to perform.

  • db_dsn – Connection string for the destination database.

  • schema_name – Destination schema name.

  • metadata – The SQLAlchemy MetaData object describing the tables.

datafaker.create.create_db_tables(metadata: sqlalchemy.schema.MetaData) None

Create tables described by the sqlalchemy metadata object.

datafaker.create.create_db_vocab(metadata: sqlalchemy.schema.MetaData, meta_dict: dict[str, Any], config: Mapping, base_path: Path = PosixPath('.')) list[str]

Load vocabulary tables from files.

Parameters:
  • metadata – The schema of the database

  • meta_dict – The simple description of the schema from --orm-file

  • config – The configuration from --config-file

Returns:

List of table names loaded.

datafaker.create.populate(dst_conn: sqlalchemy.Connection, tables: Sequence[sqlalchemy.schema.Table], table_generator_dict: Mapping[str, TableGenerator], story_generator_list: Sequence[Mapping[str, Any]], metadata: sqlalchemy.schema.MetaData) Counter[str]

Populate a database schema with synthetic data.

datafaker.main module

Entrypoint for the datafaker package.

class datafaker.main.TableType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: str, Enum

Types of tables for the list-tables command.

ALL = 'all'
GENERATED = 'generated'
VOCAB = 'vocab'
datafaker.main.configure_generators(config_file: str = typer.Option, orm_file: str = typer.Option, spec: Path = typer.Option) None

Interactively set generators for column data.

datafaker.main.configure_missing(config_file: str = typer.Option, orm_file: str = typer.Option) None

Interactively set the missingness of the generated data.

datafaker.main.configure_tables(config_file: str = typer.Option, orm_file: str = typer.Option) None

Interactively set tables to ignored, vocabulary or primary private.

datafaker.main.create_data(orm_file: str = typer.Option, df_file: str = typer.Option, config_file: Optional[str] = typer.Option, num_passes: int = typer.Option) None

Populate the schema in the target directory with synthetic data.

This CLI command generates synthetic data for Python table structures and inserts the resulting rows into the destination schema.

It takes as input the object relational model, represented as a file of Python classes and their attributes, together with the datafaker output: Python classes whose attributes and methods generate values for those attributes.

The final input is the number of passes to perform.

Example

$ datafaker create-data

datafaker.main.create_generators(orm_file: str = typer.Option, df_file: str = typer.Option, config_file: str = typer.Option, stats_file: Optional[str] = typer.Option, force: bool = typer.Option) None

Make a datafaker file of generator classes.

This CLI command takes an object relational model output by sqlcodegen and returns a set of synthetic data generators for each attribute.

Example

$ datafaker create-generators

datafaker.main.create_tables(orm_file: str = typer.Option, config_file: Optional[str] = typer.Option) None

Create schema from the ORM YAML file.

This CLI command creates the destination schema using the object relational model declared as Python tables.

Example

$ datafaker create-tables

datafaker.main.create_vocab(orm_file: str = typer.Option, config_file: str = typer.Option) None

Import vocabulary data into the target database.

Example

$ datafaker create-vocab

datafaker.main.dump_data(config_file: Optional[str] = typer.Option, orm_file: str = typer.Option, table: str = typer.Argument, output: str | None = typer.Option) None

Dump a whole table as a CSV file (or to the console) from the destination database.

datafaker.main.list_tables(orm_file: str = typer.Option, config_file: Optional[str] = typer.Option, tables: TableType = typer.Option) None

List the names of tables described in the metadata file.
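A plausible invocation, assuming typer derives the --tables option from the tables parameter above:

$ datafaker list-tables --tables vocab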

datafaker.main.load_metadata(orm_file_name: str, config: dict | None = None) sqlalchemy.MetaData

Load metadata from orm.yaml.

Parameters:
  • orm_file_name – orm.yaml or an alternative file name to load metadata from.

  • config – Used to exclude tables that are marked as ignore: true.

Returns:

SQLAlchemy MetaData object representing the database described by the loaded file.
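A short usage sketch (the file name and inspection code are illustrative only):

from datafaker.main import load_metadata

metadata = load_metadata("orm.yaml")
print(sorted(metadata.tables))  # names of the tables the MetaData describes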

datafaker.main.load_metadata_config(orm_file_name: str, config: dict | None = None) dict[str, Any]

Load the orm.yaml file, returning a dict representation.

Parameters:
  • orm_file_name – The name of the file to load.

  • config – The config.yaml file object. Ignored tables will be excluded from the output.

Returns:

A dict representing the orm.yaml file, with the tables the config says to ignore removed.

datafaker.main.load_metadata_for_output(orm_file_name: str, config: dict | None = None) Any

Load metadata excluding any foreign keys pointing to ignored tables.

datafaker.main.main(verbose: bool = typer.Option) None

Set the global parameters.

datafaker.main.make_stats(config_file: Optional[str] = typer.Option, stats_file: str = typer.Option, force: bool = typer.Option) None

Compute summary statistics from the source database.

Writes the statistics to a YAML file.

Example

$ datafaker make-stats --config-file=example_config.yaml

datafaker.main.make_tables(orm_file: str = typer.Option, force: bool = typer.Option) None

Make a YAML file representing the tables in the schema.

Example

$ datafaker make-tables

datafaker.main.make_vocab(orm_file: str = typer.Option, config_file: Optional[str] = typer.Option, force: bool = typer.Option, compress: bool = typer.Option, only: list[str] = typer.Option) None

Make files of vocabulary tables.

Each table marked in the configuration file with “vocabulary_table: true” is exported to its own file.

Example

$ datafaker make-vocab --config-file config.yml

datafaker.main.remove_data(orm_file: str = typer.Option, config_file: Optional[str] = typer.Option, yes: bool = typer.Option) None

Truncate non-vocabulary tables in the destination schema.

datafaker.main.remove_tables(orm_file: str = typer.Option, config_file: str = typer.Option, all: bool = typer.Option, yes: bool = typer.Option) None

Drop all tables in the destination schema.

Does not drop the schema itself.

datafaker.main.remove_vocab(orm_file: str = typer.Option, config_file: Optional[str] = typer.Option, yes: bool = typer.Option) None

Truncate vocabulary tables in the destination schema.

datafaker.main.validate_config(config_file: Path = typer.Argument) None

Validate the format of a config file.

datafaker.main.version() None

Display version information.

datafaker.make module

Functions to make a module of generator classes.

class datafaker.make.ColumnChoice(function_name: str, argument_values: list[str])

Bases: object

Choose columns based on a random number in [0,1).

argument_values: list[str]
function_name: str
class datafaker.make.DbConnection(engine: Union[sqlalchemy.Engine, sqlalchemy.ext.asyncio.AsyncEngine])

Bases: object

A connection to a database.

async execute_query(query_block: Mapping[str, Any]) Any

Execute the query in query_block.

async execute_raw_query(query: sqlalchemy.sql.Executable) sqlalchemy.CursorResult

Execute the query on the owned connection.

async table_row_count(table_name: str) int

Count the number of rows in the named table.
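Since these methods are coroutines, callers need an event loop. A hypothetical sketch (the table name and connection are assumptions):

import asyncio

async def count_people(db_conn: DbConnection) -> int:
    # "person" is an illustrative table name
    return await db_conn.table_row_count("person")

row_count = asyncio.run(count_people(db_conn))  # db_conn assumed constructed elsewhere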

class datafaker.make.FunctionCall(function_name: str, argument_values: list[str])

Bases: object

Contains the df.py content related to function calls.

argument_values: list[str]
function_name: str
class datafaker.make.GeneratorInfo(generator: ~typing.Union[str, ~typing.Callable[[sqlalchemy.schema.Column], tuple[str, dict[str, str]]]], summary_query: str | None = None, arg_types: dict[str, typing.Callable] = <factory>, numeric: bool = False, choice: bool = False)

Bases: object

Description of a generator.

arg_types: dict[str, Callable]
choice: bool = False
generator: Union[str, Callable[[sqlalchemy.schema.Column], tuple[str, dict[str, str]]]]
numeric: bool = False
summary_query: str | None = None
class datafaker.make.RowGeneratorInfo(variable_names: list[str], function_call: FunctionCall, primary_key: bool = False)

Bases: object

Contains the df.py content related to row generators of a table.

function_call: FunctionCall
primary_key: bool = False
variable_names: list[str]
class datafaker.make.StoryGeneratorInfo(wrapper_name: str, function_call: FunctionCall, num_stories_per_pass: int)

Bases: object

Contains the df.py content related to story generators.

function_call: FunctionCall
num_stories_per_pass: int
wrapper_name: str
class datafaker.make.TableGeneratorInfo(class_name: str, table_name: str, nonnull_columns: set[str], column_choices: list[datafaker.make.ColumnChoice], rows_per_pass: int, row_gens: list[datafaker.make.RowGeneratorInfo] = <factory>, unique_constraints: ~typing.Sequence[~typing.Union[sqlalchemy.UniqueConstraint, ~datafaker.make._PrimaryConstraint]] = <factory>)

Bases: object

Contains the df.py content related to regular tables.

class_name: str
column_choices: list[datafaker.make.ColumnChoice]
nonnull_columns: set[str]
row_gens: list[datafaker.make.RowGeneratorInfo]
rows_per_pass: int
table_name: str
unique_constraints: Sequence[Union[sqlalchemy.UniqueConstraint, _PrimaryConstraint]]
class datafaker.make.VocabularyTableGeneratorInfo(variable_name: str, table_name: str, dictionary_entry: str)

Bases: object

Contains the df.py content related to vocabulary tables.

dictionary_entry: str
table_name: str
variable_name: str
datafaker.make.fix_type(value: Any) Any

Make this value suitable for yaml output.

datafaker.make.fix_types(dics: list[dict]) list[dict]

Make all the items in this list suitable for yaml output.

datafaker.make.generate_df_content(template_context: Mapping[str, Any]) str

Generate the content of the df.py file as a string.

datafaker.make.get_result_mappings(info: GeneratorInfo, results: sqlalchemy.CursorResult) dict[str, Any] | None

Get a mapping from the results of a database query.

Returns:

A Python dictionary converted according to the GeneratorInfo provided.

datafaker.make.make_column_choices(table_config: Mapping[str, Any]) list[datafaker.make.ColumnChoice]

Convert missingness_generators from config.yaml into functions to call.

Parameters:

table_config – The tables part of config.yaml.

Returns:

A list of ColumnChoice objects; that is, descriptions of functions (and their arguments) that, when called, yield the list of columns for which values should be generated.

async datafaker.make.make_src_stats(dsn: str, config: Mapping, schema_name: Optional[str] = None) dict[str, dict[str, Any]]

Run the src-stats queries specified by the configuration.

Query the src database with the queries in the src-stats block of the config dictionary, using the differential privacy parameters set in the smartnoise-sql block of config. Record the results in a dictionary and return it.

Parameters:
  • dsn – database connection string

  • config – a dictionary with the necessary configuration

  • schema_name – name of the database schema

Returns:

The dictionary of src-stats.
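Because make_src_stats is a coroutine, a synchronous caller would wrap it; a sketch with a placeholder DSN and a config dict assumed to be loaded already:

import asyncio
from datafaker.make import make_src_stats

src_stats = asyncio.run(
    make_src_stats("postgresql://user:pass@localhost/source_db", config, schema_name="public")
)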

async datafaker.make.make_src_stats_connection(config: Mapping, db_conn: DbConnection) dict[str, dict[str, Any]]

Make the src-stats.yaml file given the database connection to read from.

Parameters:
  • config – configuration from config.yaml.

  • db_conn – Source database connection.

datafaker.make.make_table_generators(metadata: sqlalchemy.MetaData, config: Mapping, orm_filename: str, config_filename: str, src_stats_filename: Optional[str]) str

Create datafaker generator classes.

The orm and vocabulary YAML files must already have been generated (by make-tables and make-vocab).

Parameters:
  • metadata – database ORM

  • config – Configuration to control the generator creation.

  • orm_filename – “orm.yaml” file path so that the generator file can load the MetaData object

  • config_filename – “config.yaml” file path so that the generator file can load the configuration

  • src_stats_filename – The file to read source statistics from. Optional; if None, this feature is skipped.

Returns:

A string that is a valid Python module, once written to file.

datafaker.make.make_tables_file(db_dsn: str, schema_name: Optional[str]) str

Construct the YAML file representing the schema.

datafaker.make.make_vocabulary_tables(metadata: sqlalchemy.MetaData, config: Mapping, overwrite_files: bool, compress: bool, table_names: set[str] | None = None) None

Extract the data from the source database for each vocabulary table.

datafaker.providers module

This module contains Mimesis Provider sub-classes.

class datafaker.providers.BytesProvider(locale: ~mimesis.enums.Locale = Locale.EN, seed: None | int | float | str | bytes | bytearray | mimesis.types._MissingSeed = <mimesis.types._MissingSeed object>, *args: ~typing.Any, **kwargs: ~typing.Any)

Bases: BaseDataProvider

A Mimesis provider of binary data.

class Meta

Bases: object

Meta-class for BytesProvider settings.

name = 'bytes_provider'
bytes() bytes

Return a UTF-8 encoded sentence.

class datafaker.providers.ColumnValueProvider(*, seed: int | None = None, **kwargs: Any)

Bases: BaseProvider

A Mimesis provider of random values from the source database.

class Meta

Bases: object

Meta-class for ColumnValueProvider settings.

name = 'column_value_provider'
static column_value(db_connection: sqlalchemy.Connection, orm_class: Any, column_name: str) Any

Return a random value from the column specified.

increment(db_connection: sqlalchemy.Connection, column: sqlalchemy.Column) int

Return an incrementing value for the column specified.

class datafaker.providers.DistributionProvider(*, seed: int | None = None, **kwargs: Any)

Bases: BaseProvider

A Mimesis provider for various distributions.

class Meta

Bases: object

Meta-class for various distributions.

name = 'distribution_provider'
PERMITTED_SUBGENS = {'constant', 'grouped_multivariate_lognormal', 'grouped_multivariate_normal', 'multivariate_lognormal', 'multivariate_normal', 'weighted_choice', 'with_constants_at'}
alternatives(alternative_configs: list[dict[str, Any]], counts: list[dict[str, int]] | None) Any

Pick between other generators.

Parameters:
  • alternative_configs – List of alternative generators. Each alternative has the following keys: “count” – a weight for how often to use this alternative; “name” – which generator to use for this alternative, for example “composite”; “params” – the parameters for this alternative.

  • counts – A list of weights for each alternative. If None, the “count” value of each alternative is used. Each count is a dict with a “count” key.

Returns:

list of values

choice(a: list[collections.abc.Mapping[str, T]]) Optional[T]

Choose a value with equal probability.

Parameters:

a – The list of values to output. Each element is a mapping with a “value” key holding the value to return.

Returns:

The chosen value.

choice_direct(a: list[T]) T

Choose a value with equal probability.

Parameters:

a – The list of values to output.

Returns:

The chosen value.

constant(value: T) T

Return the same value always.

grouped_multivariate_lognormal(covs: list[dict[str, Any]]) list[Any]

Produce a list of values pulled from a set of multivariate distributions.

grouped_multivariate_normal(covs: list[dict[str, Any]]) list[Any]

Produce a list of values pulled from a set of multivariate distributions.

lognormal(logmean: float, logsd: float) float

Choose a value according to a lognormal distribution.

Parameters:
  • logmean – The mean of the logs of the output values.

  • logsd – The standard deviation of the logs of the output values.

Returns:

The output value.

multivariate_lognormal(cov: dict[str, Any]) list[float]

Produce a list of values pulled from a multivariate distribution.

Parameters:

cov – A dict with various keys: rank is the number of output values; m0, m1, … are the means of the distributions (rank of them); c0_0, c0_1, c1_1, … are the covariances, where cN_M is the covariance of the Nth and Mth variables, with 0 <= N <= M < rank. These are the means and covariances of the logs of the data.

Returns:

list of rank floating point values

multivariate_normal(cov: dict[str, Any]) list[float]

Produce a list of values pulled from a multivariate distribution.

Parameters:

cov – A dict with various keys: rank is the number of output values; m0, m1, … are the means of the distributions (rank of them); c0_0, c0_1, c1_1, … are the covariances, where cN_M is the covariance of the Nth and Mth variables, with 0 <= N <= M < rank.

Returns:

list of rank floating point values
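As an illustration of the key scheme, a rank-2 cov dict might look like this (all numbers are made up):

cov = {
    "rank": 2,
    "m0": 0.0, "m1": 1.0,                    # means of the two variables
    "c0_0": 1.0, "c0_1": 0.2, "c1_1": 2.0,   # upper-triangular covariances
}
generic.distribution_provider.multivariate_normal(cov)  # e.g. [0.31, 1.68]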

multivariate_normal_np(cov: dict[str, Any]) ndarray[tuple[Any, ...], dtype[_ScalarT]]

Return an array of values chosen from the given covariates.

Parameters:

cov – Keys are rank: The number of values to output; mN: The mean of variable N (where N is between 0 and one less than rank); cN_M (where 0 <= N <= M < rank): the covariance between the Nth and the Mth variables.

Returns:

A numpy array of results.

normal(mean: float, sd: float) float

Choose a value according to a Gaussian (normal) distribution.

Parameters:
  • mean – The mean of the output values.

  • sd – The standard deviation of the output values.

Returns:

The output value.

root3 = 1.7320508075688772
truncated_string(subgen_fn: Callable[[...], list[T]], params: dict, length: int) list[T]

Call subgen_fn(**params) and truncate the results to length.

uniform(low: float, high: float) float

Choose a value according to a uniform distribution.

Parameters:
  • low – The lowest value that can be chosen.

  • high – The highest value that can be chosen.

Returns:

The output value.

uniform_ms(mean: float, sd: float) float

Choose a value according to a uniform distribution.

Parameters:
  • mean – The mean of the output values.

  • sd – The standard deviation of the output values.

Returns:

The output value.
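A uniform distribution on [low, high] has mean (low + high)/2 and standard deviation (high − low)/(2√3), so uniform_ms presumably samples on [mean − √3·sd, mean + √3·sd]; the root3 constant above is consistent with this reading, though the exact bounds are an inference, not documented behaviour.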

weighted_choice(a: list[dict[str, Any]]) Any

Choose a value weighted by its count in the original dataset.

Parameters:

a – A list of dicts, each with a value key holding the value to be returned and a count key holding the number of occurrences of that value in the original dataset.

Returns:

The chosen value.
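For example, given counts taken from the source data, the following sketch returns "A" roughly 70% of the time:

rows = [{"value": "A", "count": 70}, {"value": "B", "count": 30}]
generic.distribution_provider.weighted_choice(rows)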

with_constants_at(constants_at: dict[int, T], subgen: str, params: dict[str, T]) list[T]

Insert constants into the results of a different generator.

Parameters:
  • constants_at – A dictionary of positions and objects to insert into the return list at those positions.

  • subgen – The name of the function to call to produce the results into which the constants are inserted.

  • params – Keyword arguments to the subgen function.

Returns:

A list of results from calling subgen(**params) with constants_at inserted at the appropriate indices.
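A hypothetical sketch, using multivariate_normal (one of the PERMITTED_SUBGENS) as the sub-generator and the rank-2 cov dict from the earlier example:

generic.distribution_provider.with_constants_at(
    {0: -1.0},               # force -1.0 at index 0
    "multivariate_normal",
    {"cov": cov},
)
# e.g. [-1.0, x0, x1], where x0 and x1 come from the sub-generator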

zipf_choice(a: list[collections.abc.Mapping[str, T]], n: int | None = None) Optional[T]

Choose a value according to the Zipf distribution.

The nth value (starting from 1) is chosen 1/n times as frequently as the first value.

Parameters:

a – The list of rows to choose between, most frequent first. Each element is a mapping with a “value” key holding the value to return.

Returns:

The chosen value.
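A sketch with the most frequent value first:

rows = [{"value": "the"}, {"value": "of"}, {"value": "and"}]
# "the" is returned twice as often as "of" and three times as often as "and"
generic.distribution_provider.zipf_choice(rows)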

zipf_choice_direct(a: list[T], n: int | None = None) T

Choose a value according to the Zipf distribution.

The nth value (starting from 1) is chosen 1/n times as frequently as the first value.

Parameters:

a – The list of values to output, most frequent first.

Returns:

The chosen value.

exception datafaker.providers.InappropriateGeneratorException

Bases: Exception

Exception thrown if a generator is requested that is not appropriate.

exception datafaker.providers.NothingToGenerateException(message: str)

Bases: Exception

Exception thrown when no value can be generated.

class datafaker.providers.NullProvider(*, seed: None | int | float | str | bytes | bytearray | mimesis.types._MissingSeed = <mimesis.types._MissingSeed object>, random: mimesis.random.Random | None = None)

Bases: BaseProvider

A Mimesis provider that always returns None.

class Meta

Bases: object

Meta-class for NullProvider settings.

name = 'null_provider'
static null() None

Return None.

class datafaker.providers.SQLGroupByProvider(*, seed: None | int | float | str | bytes | bytearray | mimesis.types._MissingSeed = <mimesis.types._MissingSeed object>, random: mimesis.random.Random | None = None)

Bases: BaseProvider

A Mimesis provider that samples from the results of a SQL GROUP BY query.

class Meta

Bases: object

Meta-class for SQLGroupByProvider settings.

name = 'sql_group_by_provider'
sample(group_by_result: list[dict[str, Any]], weights_column: str, value_columns: Optional[Union[str, list[str]]] = None, filter_dict: Optional[dict[str, Any]] = None) Union[Any, dict[str, Any], tuple[Any, ...]]

Randomly sample a row from the result of a SQL GROUP BY query.

The result of the query is assumed to be in the format that datafaker’s make-stats outputs.

For example, if one executes the following src-stats query

SELECT COUNT(*) AS num, nationality, gender, age
FROM person
GROUP BY nationality, gender, age

and calls it the count_demographics query, one can then use

generic.sql_group_by_provider.sample(
    SRC_STATS["count_demographics"],
    weights_column="num",
    value_columns=["gender", "nationality"],
    filter_dict={"age": 23},
)

to restrict the results of the query to only people aged 23, and randomly sample a pair of gender and nationality values (returned as a tuple in that order), with the sampling weights given by the counts num.

Parameters:
  • group_by_result – Result of the query. A list of rows, with each row being a dictionary with names of columns as keys.

  • weights_column – Name of the column which holds the weights based on which to sample. Typically the result of a COUNT(*).

  • value_columns – Name(s) of the column(s) to include in the result. Either a string for a single column, an iterable of strings for multiple columns, or None for all columns (default).

  • filter_dict – Dictionary of {name_of_column: value_it_must_have}, to restrict the sampling to a subset of group_by_result. Optional.

Returns:

  • a single value if value_columns is a single column name;

  • a tuple of values in the same order as value_columns if value_columns is an iterable of strings;

  • a dictionary of {name_of_column: value} if value_columns is None.

class datafaker.providers.TimedeltaProvider(*, seed: None | int | float | str | bytes | bytearray | mimesis.types._MissingSeed = <mimesis.types._MissingSeed object>, random: mimesis.random.Random | None = None)

Bases: BaseProvider

A Mimesis provider of timedeltas.

class Meta

Bases: object

Meta-class for TimedeltaProvider settings.

name = 'timedelta_provider'
static timedelta(min_dt: timedelta = datetime.timedelta(0), max_dt: timedelta = datetime.timedelta(days=49710, seconds=23296)) timedelta

Return a random timedelta object.

class datafaker.providers.TimespanProvider(*, seed: None | int | float | str | bytes | bytearray | mimesis.types._MissingSeed = <mimesis.types._MissingSeed object>, random: mimesis.random.Random | None = None)

Bases: BaseProvider

A Mimesis provider for timespans.

A timespan consists of a start datetime, an end datetime, and the timedelta in between; it is returned as a 3-tuple.

class Meta

Bases: object

Meta-class for TimespanProvider settings.

name = 'timespan_provider'
static timespan(earliest_start_year: int, last_start_year: int, min_dt: timedelta = datetime.timedelta(0), max_dt: timedelta = datetime.timedelta(days=49710, seconds=23296)) tuple[datetime.datetime, datetime.datetime, datetime.timedelta]

Return a timespan as a 3-tuple of (start, end, delta).
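A sketch of a call (the years are arbitrary):

start, end, delta = generic.timespan_provider.timespan(
    earliest_start_year=1990, last_start_year=2020
)
assert start + delta == end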

class datafaker.providers.WeightedBooleanProvider(*, seed: None | int | float | str | bytes | bytearray | mimesis.types._MissingSeed = <mimesis.types._MissingSeed object>, random: mimesis.random.Random | None = None)

Bases: BaseProvider

A Mimesis provider for booleans with a given probability for True.

class Meta

Bases: object

Meta-class for WeightedBooleanProvider settings.

name = 'weighted_boolean_provider'
bool(probability: float) bool

Return True with given probability, otherwise False.
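For example:

generic.weighted_boolean_provider.bool(0.9)  # True roughly 90% of the time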

datafaker.providers.merge_with_constants(xs: list[T], constants_at: dict[int, T]) Generator[T, None, None]

Merge a list of items with other items that must be placed at certain indices.

Parameters:
  • constants_at – A map of indices to objects that must be placed at those indices.

  • xs – Items that fill in the gaps left by constants_at.

Returns:

xs with constants_at inserted at the appropriate points. If there are not enough elements in xs to fill in the gaps in constants_at, the elements of constants_at after the gap are dropped.
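A minimal sketch of the documented behaviour (note that the function returns a generator, so it is wrapped in list here):

from datafaker.providers import merge_with_constants

list(merge_with_constants(["a", "b"], {1: "X"}))  # -> ["a", "X", "b"]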

datafaker.providers.zipf_weights(size: int) list[float]

Get the weights of a Zipf distribution of a given size.

datafaker.settings module

Utils for reading settings from environment variables.

Uses pydantic to enforce type hints at runtime, functools.lru_cache to save time and memory on repeated calls, and typing for type hinting.

Classes:

Settings

Functions:

get_settings() -> Settings

class datafaker.settings.Settings(*args: Any, **kwargs: Any)

Bases: BaseSettings

A Pydantic settings class with optional and mandatory settings.

Settings class attributes describe two database connections. The source database connection is the database schema from which the object relational model is discovered. The destination database connection is the location where tables based on the ORM are created and synthetic values are inserted.

src_dsn

A DSN for connecting to the source database.

Type:

str

src_schema

The source database schema to use, if applicable.

Type:

str

dst_dsn

A DSN for connecting to the destination database.

Type:

str

dst_schema

The destination database schema to use, if applicable.

Type:

str

class Config

Bases: object

Meta-settings for the Settings class.

validate_dst_dsn(dsn: Optional[str], values: Any) Optional[str]

Create and validate the destination DB DSN.

validate_src_dsn(dsn: Optional[str], values: Any) Optional[str]

Create and validate the source DB DSN.

datafaker.settings.get_settings() Settings

Return the same Settings object every call.
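A usage sketch; the printed attribute is one of those documented above:

from datafaker.settings import get_settings

settings = get_settings()           # cached: the same object on every call
assert settings is get_settings()
print(settings.src_dsn)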