sieve¶
Filter and transform a mysqldump stream.
This command processes mysqldump output, potentially filtering or transforming the output based on the provided command line options.
sieve effective works in two modes:
- streaming; mysqldump is read from
--input-file
and written to stdout possibly with different output depending on the provided options.- directory; mysqldump is read from
--input-file
and split into separate files in the requested directory. This allows converting a large dump in a file-per-table easily. Files output in this mode are additionally filtered through--compress-command
and are processed throughgzip --fast
by default so the output is compressed on disk by default.
Usage¶
Usage: dbsake sieve [options]
Filter and transform mysqldump output.
sieve can extract single tables from a mysqldump file and perform useful
transformations, such as adding indexes after the table data is loaded for
InnoDB tables, where such indexes can be created much more efficiently
than the default incremental rebuild that mysqldump performs.
Options:
-F, --format <name> Select the output format (directory, stream)
-C, --directory <path> Specify output directory when
--format=directory
-i, --input-file <path> Specify input file to process instead of
stdin
-z, --compress-command <name> Specify compression command when
--format=directory
-t, --table <glob> Only output tables matching the given glob
pattern
-T, --exclude-table <glob> Excludes tables matching the given glob
pattern
--defer-indexes Add secondary indexes after loading table
data
--defer-foreign-keys Add foreign key constraints after loading
table data
--write-binlog / --no-write-binlog
Include SQL_LOG_BIN = 0 in output to disable
binlog
--table-schema / --no-table-schema
Include/exclude table schema from output.
--table-data / --no-table-data Include/exclude table data from output
--routines / --no-routines Include / exclude database routines from
output
--events / --no-events Include / exclude database events from
output
--triggers / --no-triggers Include/exclude table triggers from output
--master-data / --no-master-data
Uncomment/comment CHANGE MASTER in input, if
present
-O, --to-stdout Force output on stdout, even to a terminal.
-?, --help Show this message and exit.
Example¶
$ mysqldump --routines sakila | dbsake sieve --format=directory --directory=backups/
$ tree backups
backups
└── sakila
├── actor.sql.gz
├── address.sql.gz
├── category.sql.gz
├── city.sql.gz
├── country.sql.gz
├── customer.sql.gz
├── film_actor.sql.gz
├── film_category.sql.gz
├── film.sql.gz
├── film_text.sql.gz
├── inventory.sql.gz
├── language.sql.gz
├── payment.sql.gz
├── rental.sql.gz
├── routines.ddl.gz
├── staff.sql.gz
├── store.sql.gz
└── views.ddl.gz
1 directory, 18 files
Options¶
Changed in version 2.0.0: Renamed split-mysqldump to sieve; Significant rewrite of functionality.
Changed in version 2.0.0: Remove –regex option in favor of -t/–table and -T/–exclude-table option which accepts globs.
-
-F
,
--format
<name>
¶ Output file format. Must be one of ‘stream’ or ‘directory’. If set to ‘stream’, output will be written on stdout. Unless –force is also specified the sieve command with refuse to write to a terminal.
If set to ‘directory’, output will be written to the path specified by the
--directory
option, with a file per table.
New in version 2.0.0.
-
-C
,
--directory
<output directory>
¶ Path where the sieve command should create output files. Ignored if
--format
is set to ‘stream’. The sieve command will create this path if it does not already exist.Defaults to ‘.’ - the current working directory.
-
-i
,
--input-file
<path>
¶ Input file to read mysqldump input from. Default to “-” and reads from stdin. This must be an uncompressed data source, so to process an already compressed .sql.gz file you might run it through “zcat backup.sql.gz | dbsake sieve [options…]”
New in version 2.0.0.
-
-z
,
--compress-command
<command>
¶ Filter output files through this command. If
--format
is not set to ‘directory’, then this option is ignored. The sieve command will detect most common compression command and create an appropriate extension on the output files. For example, –compress-command=gzip will create .sql.gz files under the path specified by--directory
.Defaults to “gzip -1”.
Changed in version 2.0.0: -f/–filter-command was renamed to -z/–compress-command
-
-t
,
--table
<glob pattern>
¶ f
--table
is specified, then only tables matching the provided glob pattern will be included in the output of the sieve command. Each table is qualified by the database name in “database.table” format and then compared against the glob pattern. For example, to include all tables in the “mysql” database you would specify –table=”mysql.*”.This option may be specified multiple times and sieve will include any table that matches at least one of these options so long as the table does not also match an
--exclude-table
option.If no –table options are provided, all tables are included in the output that do not otherwise match an
--exclude-table
pattern.
New in version 2.0.0.
-
-T
,
--exclude-table
<glob pattern>
¶ If
--exclude-table
is specified, then only tables not matching the provided glob pattern will be included in the output of the sieve command. Each table is qualified by the database name in “database.table” format and then compared against the glob pattern. For example, to exclude the mysql.user table from output you would specify the option: “–exclude-table=mysql.user”.This option may be specified multiple times and sieve will include any table that matches at least one of these options so long as the table does not also match an
--exclude-table
option.If no
--exclude-table
options are provided, all tables are included in the output that match at least one--table
pattern, or all output is included if neither--exclude-table
or--table
options are provided.
New in version 2.0.0.
-
--defer-indexes
¶
This option rewrites the output of CREATE TABLE statements and arranges for secondary indexes to be created after the table data is loaded. This causes an additional ALTER TABLE statement to be output after the table data section of each table, when there is at least one secondary index to be added.
If there are foreign key constraints on the table, associated indexes will not be deferred unless the
--defer-foreign-keys
option is also specified.This option only applies to InnoDB tables and is only efficient on MySQL 5.1+ (if the innodb plugin is enabled) or on MySQL 5.5+ (default InnoDB engine), where the fast alter path may be used.
-
--defer-foreign-keys
¶
This option rewrites the output of CREATE TABLE statements and adds foreign key constraints after the table data is loaded. This is primarily useful to allow deferring secondary indexes with associated foreign keys.
This option only makes sense if reloading a dump into MySQL 5.6+, othrewise adding indexes will require a full table rebuild and will end up being much slower than just reloading the mysqldump unaltered.
-
--write-binlog
/ --no-write-binlog
¶ If
--no-write-binlog
is set, sieve will output a SET SQL_LOG_BIN=0 SQL command to the beginning of the dump to avoid writing to the binary log when reloading the resulting output. Use the option with care, as the resulting dump will not replicate to a slave if this option is set.
New in version 2.0.0.
-
--table-schema
/ --no-table-schema
¶ If
--no-table-schema
is used, sieve will not output any CREATE TABLE statements and will not output any CREATE VIEW statements. Only table data, routines and events will be output (as dictated by other options).
New in version 2.0.0.
-
--table-data
/ --no-table-data
¶ If
--skip-table-data
is set, sieve will not output any table data sections and only output DDL. Reloading such a dump will result in empty tables.
New in version 2.0.0.
-
--master-data
/ --no-master-data
¶ If the
--master-data
option is set, any commented out CHANGE MASTER statements will be uncommented in the output. This is useful of setting up a replication slave from a backup created using –master-data=2.If the
--no-master-data
option is set, any CHANGE MASTER statements will be commented out in the output, ensuring no CHANGE MASTER is run. This is useful for dumps created with –master-data[=1].
New in version 2.0.0.
-
--routines
/ --no-routines
¶ Include or exclude routines from the output, if routines were found in the input file. By default routines are not excluded and will ony be excluded if the –no-routines option is specified. The –routines option used to cancel a previous –no-routines option.
New in version 2.0.0.
-
--events
/ --no-events
¶ Include or exclude events from the output, if events were found in the input file. By default events are not excluded and will ony be excluded if the –no-events option is specified. The –events option can be used to cancel a previous –no-events option.
New in version 2.0.0.
-
--triggers
/ --no-triggers
¶ Include or exclude table triggers from the output, if triggers were found in the input file. By default triggers are included for any output tables (subject to table filtering). –no-triggers will disable output for all triggers and –triggers can be used to cancel the effects of an earlier –no-triggers option.
New in version 2.0.0.
-
-O
,
--to-stdout
¶
The
--to-stdout
option will force output to be written to stdout even if stdout appears to be an active terminal. This can be useful in cases when filtering the mysqldump output or when not outputing large amounts of data and want to read it directly on the terminal. By default, the sieve command will abort if it detects that it would output to a terminal and –to-stdout is not used.