Kea
1.9.9-git
|
Implements a CSV file that supports multiple versions of the file's "schema". More...
#include <versioned_csv_file.h>
Public Types | |
enum | InputSchemaState { CURRENT, NEEDS_UPGRADE, NEEDS_DOWNGRADE } |
Possible input file schema states. More... | |
Public Member Functions | |
VersionedCSVFile (const std::string &filename) | |
Constructor. More... | |
virtual | ~VersionedCSVFile () |
Destructor. More... | |
void | addColumn (const std::string &col_name, const std::string &version, const std::string &default_value="") |
Adds metadata for a single column to the schema. More... | |
size_t | getInputHeaderCount () const |
Returns the number of columns found in the input header. More... | |
enum InputSchemaState | getInputSchemaState () const |
Fetches the state of the input file's schema. More... | |
std::string | getInputSchemaVersion () const |
Returns the schema version of the physical file. More... | |
size_t | getMinimumValidColumns () const |
Returns the minimum number of columns which must be present for the file to be considered valid. More... | |
std::string | getSchemaVersion () const |
Returns the text version of the current schema supported by the file's metadata. More... | |
size_t | getValidColumnCount () const |
Returns the number of valid columns found in the header. More... | |
const VersionedColumnPtr & | getVersionedColumn (const size_t index) const |
Fetch the column descriptor for a given index. More... | |
bool | needsConversion () const |
Returns true if the input file schema state is not CURRENT. More... | |
bool | next (CSVRow &row) |
Reads the next row from the file. More... | |
virtual void | open (const bool seek_to_end=false) |
Opens existing file or creates a new one. More... | |
virtual void | recreate () |
Creates a new CSV file. More... | |
void | setMinimumValidColumns (const std::string &column_name) |
Sets the minimum number of valid columns based on a given column. More... | |
Public Member Functions inherited from isc::util::CSVFile | |
CSVFile (const std::string &filename) | |
Constructor. More... | |
virtual | ~CSVFile () |
Destructor. More... | |
void | addColumn (const std::string &col_name) |
Adds new column name. More... | |
void | append (const CSVRow &row) const |
Writes the CSV row into the file. More... | |
void | close () |
Closes the CSV file. More... | |
bool | exists () const |
Checks if the CSV file exists and can be opened for reading. More... | |
void | flush () const |
Flushes a file. More... | |
size_t | getColumnCount () const |
Returns the number of columns in the file. More... | |
size_t | getColumnIndex (const std::string &col_name) const |
Returns the index of the column having specified name. More... | |
std::string | getColumnName (const size_t col_index) const |
Returns the name of the column. More... | |
std::string | getFilename () const |
Returns the path to the CSV file. More... | |
std::string | getReadMsg () const |
Returns the description of the last error returned by the CSVFile::next function. More... | |
bool | next (CSVRow &row, const bool skip_validation=false) |
Reads next row from CSV file. More... | |
void | setReadMsg (const std::string &read_msg) |
Sets error message after row validation. More... | |
Protected Member Functions | |
void | columnCountError (const CSVRow &row, const std::string &reason) |
Convenience method for adding an error message. More... | |
virtual bool | validateHeader (const CSVRow &header) |
Validates the header of a VersionedCSVFile. More... | |
Protected Member Functions inherited from isc::util::CSVFile | |
void | addColumnInternal (const std::string &col_name) |
Adds a column regardless if the file is open or not. More... | |
virtual bool | validate (const CSVRow &row) |
Validate the row read from a file. More... | |
Additional Inherited Members | |
Static Public Member Functions inherited from isc::util::CSVFile | |
static CSVRow | EMPTY_ROW () |
Represents empty row. More... | |
Implements a CSV file that supports multiple versions of the file's "schema".
This allows files with older schemas to be upgraded to newer schemas as they are being read. The file's schema is defined through a list of column descriptors, or isc::util::VersionedColumn(s). Each descriptor contains metadata describing the column, consisting of the column's name, the version label in which the column was added to the schema, and a default value to be used if the column is missing from the file.
Note that the column descriptors are defined in the order they occur in the file, when reading a row from left to right. This also assumes that when a new version of the schema evolves, all new columns are added at the end of the row. In other words, the order of the columns reflects not only the order in which they occur in a row but also the order in which they were added to the schema. Conceptually, the entire list of columns defined constitutes the current schema. Earlier schema versions are therefore subsets of this list.
Creating the schema is done by calling VersionedCSVFile::addColumn() for each column. Note that the schema must be defined prior to opening the file.
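The schema model described above can be illustrated with a small standalone sketch. It does not use the Kea classes: `ColumnSketch` and `SchemaSketch` are hypothetical stand-ins, and rejecting duplicates with `std::invalid_argument` is an assumption made only to keep the example self-contained.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a column descriptor: the column name, the
// version label in which it was added, and the default used when missing.
struct ColumnSketch {
    std::string name;
    std::string version;
    std::string default_value;
};

// Hypothetical schema holder. Columns are kept in the order they are added,
// which is also the order in which they appear in a row.
class SchemaSketch {
public:
    void addColumn(const std::string& name, const std::string& version,
                   const std::string& default_value = "") {
        // Reject a second column with the same name.
        for (const ColumnSketch& col : columns_) {
            if (col.name == name) {
                throw std::invalid_argument("duplicate column: " + name);
            }
        }
        columns_.push_back(ColumnSketch{name, version, default_value});
    }

    const std::vector<ColumnSketch>& columns() const { return columns_; }

private:
    std::vector<ColumnSketch> columns_;
};

// Builds an example schema: two columns from version 1.0, one added in 2.0.
inline SchemaSketch makeExampleSchema() {
    SchemaSketch schema;
    schema.addColumn("animal", "1.0");
    schema.addColumn("color", "1.0", "blue");
    schema.addColumn("age", "2.0", "21");
    return schema;
}
```

In this sketch the version 1.0 schema is simply the first two columns, which is what makes earlier versions subsets of the full list.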
The first row of the file is always the header row and is a comma-separated list of the names of the columns in the file. This row is used when opening the file via VersionedCSVFile::open() to identify its schema version so that it may be read correctly. This is done by comparing the columns found in the header to the columns defined in the schema. The columns must match both by name and by the order in which they occur.
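A minimal sketch of the header comparison described above, assuming the match is evaluated position by position and stops at the first mismatch. `countLeadingMatches` is a hypothetical helper, not a member of the documented class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns how many leading header names match the schema's column names,
// comparing position by position and stopping at the first mismatch.
inline std::size_t countLeadingMatches(const std::vector<std::string>& schema,
                                       const std::vector<std::string>& header) {
    std::size_t i = 0;
    while (i < schema.size() && i < header.size() && schema[i] == header[i]) {
        ++i;
    }
    return i;
}
```

A header that lists the right names in the wrong order yields a count of zero here, which reflects the requirement that columns match by both name and position.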
After successfully opening a file, rows are read one at a time via VersionedCSVFile::next() and handled according to the input schema state. Each data row is expected to have at least the same number of columns as were found in the header. Any row which has fewer values is discarded as invalid. Similarly, any row which is found to have more values than were found in the header is discarded as invalid.
When upgrading a row, the value for each missing column is filled in with the default value specified by that column's descriptor. When downgrading a row, extraneous values are dropped from the row.
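The two conversions can be sketched on plain vectors of strings. `RowSketch`, `upgradeRow`, and `downgradeRow` are hypothetical names for illustration; the real class operates on CSVRow objects.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical row type: one string per column value.
using RowSketch = std::vector<std::string>;

// Appends the default value of every schema column the row is missing.
// `defaults` holds one default per defined column, in schema order.
inline void upgradeRow(RowSketch& row, const std::vector<std::string>& defaults) {
    for (std::size_t i = row.size(); i < defaults.size(); ++i) {
        row.push_back(defaults[i]);
    }
}

// Drops trailing values beyond the number of defined columns.
inline void downgradeRow(RowSketch& row, std::size_t defined_columns) {
    if (row.size() > defined_columns) {
        row.resize(defined_columns);
    }
}
```

Because new columns are only ever added at the end of a row, both conversions touch only the tail of the row and leave existing values in place.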
It is important to note that upgrading or downgrading a file does NOT alter the physical file itself. Rather, the conversion occurs after the raw data has been read but before it is passed to the caller.
Also note that there is currently no support for writing out a file in anything other than the current schema.
Definition at line 120 of file versioned_csv_file.h.
Possible input file schema states.
Used to categorize the input file's schema, relative to the defined schema.
Enumerator | |
---|---|
CURRENT | |
NEEDS_UPGRADE | |
NEEDS_DOWNGRADE |
Definition at line 126 of file versioned_csv_file.h.
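One plausible way to relate the three states to column counts is sketched below. The mapping is an assumption made for illustration (fewer matching columns than defined means upgrade, more header columns than defined means downgrade); the actual rules live in validateHeader(). `SchemaStateSketch` and `classifySchemaState` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mirror of the three input schema states.
enum class SchemaStateSketch { CURRENT, NEEDS_UPGRADE, NEEDS_DOWNGRADE };

// Classifies an input file from three counts:
//   valid_columns   - header columns that matched the defined schema
//   header_columns  - total columns found in the header
//   defined_columns - columns defined in the current schema
// Assumed mapping, not taken from the Kea sources.
inline SchemaStateSketch classifySchemaState(std::size_t valid_columns,
                                             std::size_t header_columns,
                                             std::size_t defined_columns) {
    if (valid_columns < defined_columns) {
        return SchemaStateSketch::NEEDS_UPGRADE;
    }
    if (header_columns > defined_columns) {
        return SchemaStateSketch::NEEDS_DOWNGRADE;
    }
    return SchemaStateSketch::CURRENT;
}
```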
isc::util::VersionedCSVFile::VersionedCSVFile | ( | const std::string & | filename | ) |
isc::util::VersionedCSVFile::~VersionedCSVFile | ( | ) |
virtual |
Destructor.
Definition at line 20 of file versioned_csv_file.cc.
void isc::util::VersionedCSVFile::addColumn | ( | const std::string & | col_name, |
const std::string & | version, | ||
const std::string & | default_value = "" |
||
) |
Adds metadata for a single column to the schema.
This method appends a new column description to the file's schema. Note that this does not cause anything to be written to the physical file. The name of the column will be placed in the CSV header when a new file is created by calling the recreate or open function.
col_name | Name of the column. |
version | Text representation of the schema version in which this column first appeared. |
default_value | Value the missing column should be given during an upgrade. It defaults to an empty string, "". |
CSVFileError | if a column with the specified name exists. |
Definition at line 24 of file versioned_csv_file.cc.
References isc::util::CSVFile::addColumn().
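void isc::util::VersionedCSVFile::columnCountError | ( | const CSVRow & | row, const std::string & | reason | ) |
protected |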
Convenience method for adding an error message.
Constructs an error message indicating that the number of columns in a given row is wrong and why, then adds it to readMsg.
row | The row in error |
reason | An explanation as to why the row column count is wrong |
Definition at line 182 of file versioned_csv_file.cc.
References isc::util::CSVFile::getFilename(), isc::util::CSVRow::getValuesCount(), and isc::util::CSVFile::setReadMsg().
Referenced by next().
size_t isc::util::VersionedCSVFile::getInputHeaderCount | ( | ) | const |
Returns the number of columns found in the input header.
Definition at line 56 of file versioned_csv_file.cc.
Referenced by next(), and validateHeader().
VersionedCSVFile::InputSchemaState isc::util::VersionedCSVFile::getInputSchemaState | ( | ) | const |
Fetches the state of the input file's schema.
Reflects the state of the input file's schema relative to the defined schema as an enum, InputSchemaState.
Definition at line 85 of file versioned_csv_file.cc.
Referenced by next().
std::string isc::util::VersionedCSVFile::getInputSchemaVersion | ( | ) | const |
Returns the schema version of the physical file.
Definition at line 95 of file versioned_csv_file.cc.
References getValidColumnCount(), and getVersionedColumn().
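Since columns are ordered by the version in which they were added, the version of the physical file can be read off the last valid column. The sketch below assumes that reading, which is consistent with the two functions referenced above but is not confirmed by this page; the "undefined" fallback and the names `VersionedColumnSketch` and `inputSchemaVersion` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a column descriptor.
struct VersionedColumnSketch {
    std::string name;
    std::string version;
};

// Returns the version label of the last valid column, i.e. the newest
// schema version the input file's header fully covers.
// Returns "undefined" (an assumed placeholder) when no column is valid.
inline std::string inputSchemaVersion(
        const std::vector<VersionedColumnSketch>& columns,
        std::size_t valid_column_count) {
    if (valid_column_count == 0 || valid_column_count > columns.size()) {
        return "undefined";
    }
    return columns[valid_column_count - 1].version;
}
```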
size_t isc::util::VersionedCSVFile::getMinimumValidColumns | ( | ) | const |
Returns the minimum number of columns which must be present for the file to be considered valid.
Definition at line 46 of file versioned_csv_file.cc.
Referenced by validateHeader().
std::string isc::util::VersionedCSVFile::getSchemaVersion | ( | ) | const |
Returns the text version of the current schema supported by the file's metadata.
Definition at line 104 of file versioned_csv_file.cc.
References isc::util::CSVFile::getColumnCount(), and getVersionedColumn().
size_t isc::util::VersionedCSVFile::getValidColumnCount | ( | ) | const |
Returns the number of valid columns found in the header.
For newly created files this will always match the number of defined columns (i.e. getColumnCount()). For existing files, this will be the number of columns in the header that match the defined columns. When this number is less than getColumnCount() it means the input file is from an earlier schema. This value is zero until the file has been opened.
Definition at line 51 of file versioned_csv_file.cc.
Referenced by getInputSchemaVersion(), next(), and validateHeader().
const VersionedColumnPtr & isc::util::VersionedCSVFile::getVersionedColumn | ( | const size_t | index | ) | const |
Fetch the column descriptor for a given index.
index | index within the list of columns of the desired column |
OutOfRange | exception if the index is invalid |
Definition at line 113 of file versioned_csv_file.cc.
References isc::util::CSVFile::getColumnCount(), isc::util::CSVFile::getFilename(), and isc_throw.
Referenced by getInputSchemaVersion(), and getSchemaVersion().
bool isc::util::VersionedCSVFile::needsConversion | ( | ) | const |
Returns true if the input file schema state is not CURRENT.
Definition at line 90 of file versioned_csv_file.cc.
References CURRENT.
bool isc::util::VersionedCSVFile::next | ( | CSVRow & | row | ) |
Reads the next row from the file.
This function will return the CSVRow object representing a parsed row if parsing is successful. If the end of file has been reached, the empty row is returned (a row containing no values).
When a valid row has fewer than the defined number of columns, the value for each missing column is filled in with the default value specified by that column's descriptor.
[out] | row | Object receiving the parsed CSV row. |
Definition at line 124 of file versioned_csv_file.cc.
References isc::util::CSVRow::append(), columnCountError(), CURRENT, isc::util::CSVFile::EMPTY_ROW(), isc::util::CSVFile::getColumnCount(), getInputHeaderCount(), getInputSchemaState(), getValidColumnCount(), isc::util::CSVRow::getValuesCount(), NEEDS_DOWNGRADE, NEEDS_UPGRADE, isc::util::CSVFile::next(), isc::util::CSVFile::setReadMsg(), and isc::util::CSVRow::trim().
void isc::util::VersionedCSVFile::open | ( | const bool | seek_to_end = false | ) |
virtual |
Opens existing file or creates a new one.
This function will try to open an existing file if the file has a size greater than 0. If the file doesn't exist or has a size of 0, the file is recreated. If an existing file has been opened, the header is parsed and validated against the schema. By default, the data pointer in the file is set to the beginning of the first data row. In order to retrieve the row contents, the next function should be called. If the seek_to_end parameter is set to true, the file will be opened and the internal pointer will be set to the end of the file.
seek_to_end | A boolean value which indicates if the input and output file pointer should be set at the end of file. |
VersionedCSVFileError | if the schema has not been defined, CSVFileError when an IO operation fails, or the header fails to validate. |
Reimplemented from isc::util::CSVFile.
Reimplemented in isc::dhcp::CSVLeaseFile4, and isc::dhcp::CSVLeaseFile6.
Definition at line 61 of file versioned_csv_file.cc.
References isc::util::CSVFile::getColumnCount(), isc::util::CSVFile::getFilename(), isc_throw, and isc::util::CSVFile::open().
void isc::util::VersionedCSVFile::recreate | ( | ) |
virtual |
Creates a new CSV file.
The file creation will fail if there are no columns specified. Otherwise, this function will write the header to the file. In order to write rows to the opened file, the append function should be called.
VersionedCSVFileError | if the schema has not been defined |
CSVFileError | if an IO operation fails |
Reimplemented from isc::util::CSVFile.
Definition at line 72 of file versioned_csv_file.cc.
References isc::util::CSVFile::getColumnCount(), isc::util::CSVFile::getFilename(), isc_throw, and isc::util::CSVFile::recreate().
void isc::util::VersionedCSVFile::setMinimumValidColumns | ( | const std::string & | column_name | ) |
Sets the minimum number of valid columns based on a given column.
column_name | Name of the column which positionally represents the minimum columns which must be present in a file for it to be considered valid. |
Definition at line 33 of file versioned_csv_file.cc.
References isc::util::CSVFile::getColumnIndex(), and isc_throw.
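The phrase "positionally represents" suggests that the minimum is derived from the named column's position in the schema. A minimal sketch under that assumption follows, taking the minimum to be the column's zero-based index plus one. `minimumValidColumns` is a hypothetical free function, and throwing `std::out_of_range` for an unknown name is likewise an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Returns the minimum number of valid columns implied by `column_name`:
// every column up to and including the named one must be present.
// Throws std::out_of_range (assumed behaviour) if the name is not defined.
inline std::size_t minimumValidColumns(const std::vector<std::string>& schema,
                                       const std::string& column_name) {
    for (std::size_t i = 0; i < schema.size(); ++i) {
        if (schema[i] == column_name) {
            return i + 1;
        }
    }
    throw std::out_of_range("unknown column: " + column_name);
}
```

With a schema of animal, color, age, naming "color" would require at least the first two columns to be present in any input header.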
bool isc::util::VersionedCSVFile::validateHeader | ( | const CSVRow & | header | ) |
protected virtual |
Validates the header of a VersionedCSVFile.
This function is called internally when reading in an existing file. It parses the header row of the file, comparing each value in succession against the defined list of columns. If the header contains too few matching columns (i.e. fewer than minimum_valid_columns_) or too many (more than the number of defined columns), the file is presumed to be either too old, too new, or too corrupt to process. Otherwise it retains the number of valid columns found and deems the header valid.
header | A row holding a header. |
Reimplemented from isc::util::CSVFile.
Definition at line 192 of file versioned_csv_file.cc.
References isc::util::CSVFile::getColumnCount(), isc::util::CSVFile::getColumnName(), getInputHeaderCount(), getMinimumValidColumns(), getValidColumnCount(), isc::util::CSVRow::getValuesCount(), isc_throw, NEEDS_DOWNGRADE, NEEDS_UPGRADE, isc::util::CSVRow::readAt(), and isc::util::CSVFile::setReadMsg().