Kea  1.9.9-git
isc::util::VersionedCSVFile Class Reference

Implements a CSV file that supports multiple versions of the file's "schema". More...

#include <versioned_csv_file.h>


Public Types

enum  InputSchemaState { CURRENT, NEEDS_UPGRADE, NEEDS_DOWNGRADE }
 Possible input file schema states. More...
 

Public Member Functions

 VersionedCSVFile (const std::string &filename)
 Constructor. More...
 
virtual ~VersionedCSVFile ()
 Destructor. More...
 
void addColumn (const std::string &col_name, const std::string &version, const std::string &default_value="")
 Adds metadata for a single column to the schema. More...
 
size_t getInputHeaderCount () const
 Returns the number of columns found in the input header. More...
 
enum InputSchemaState getInputSchemaState () const
 Fetches the state of the input file's schema. More...
 
std::string getInputSchemaVersion () const
 Returns the schema version of the physical file. More...
 
size_t getMinimumValidColumns () const
 Returns the minimum number of columns which must be present for the file to be considered valid. More...
 
std::string getSchemaVersion () const
 Returns the text version of the current schema supported by the file's metadata. More...
 
size_t getValidColumnCount () const
 Returns the number of valid columns found in the header. More...
 
const VersionedColumnPtr & getVersionedColumn (const size_t index) const
 Fetch the column descriptor for a given index. More...
 
bool needsConversion () const
 Returns true if the input file schema state is not CURRENT. More...
 
bool next (CSVRow &row)
 Reads the next row from the file. More...
 
virtual void open (const bool seek_to_end=false)
 Opens existing file or creates a new one. More...
 
virtual void recreate ()
 Creates a new CSV file. More...
 
void setMinimumValidColumns (const std::string &column_name)
 Sets the minimum number of valid columns based on a given column. More...
 
- Public Member Functions inherited from isc::util::CSVFile
 CSVFile (const std::string &filename)
 Constructor. More...
 
virtual ~CSVFile ()
 Destructor. More...
 
void addColumn (const std::string &col_name)
 Adds new column name. More...
 
void append (const CSVRow &row) const
 Writes the CSV row into the file. More...
 
void close ()
 Closes the CSV file. More...
 
bool exists () const
 Checks if the CSV file exists and can be opened for reading. More...
 
void flush () const
 Flushes a file. More...
 
size_t getColumnCount () const
 Returns the number of columns in the file. More...
 
size_t getColumnIndex (const std::string &col_name) const
 Returns the index of the column having specified name. More...
 
std::string getColumnName (const size_t col_index) const
 Returns the name of the column. More...
 
std::string getFilename () const
 Returns the path to the CSV file. More...
 
std::string getReadMsg () const
 Returns the description of the last error returned by the CSVFile::next function. More...
 
bool next (CSVRow &row, const bool skip_validation=false)
 Reads next row from CSV file. More...
 
void setReadMsg (const std::string &read_msg)
 Sets error message after row validation. More...
 

Protected Member Functions

void columnCountError (const CSVRow &row, const std::string &reason)
 Convenience method for adding an error message. More...
 
virtual bool validateHeader (const CSVRow &header)
 Validates the header of a VersionedCSVFile. More...
 
- Protected Member Functions inherited from isc::util::CSVFile
void addColumnInternal (const std::string &col_name)
 Adds a column regardless if the file is open or not. More...
 
virtual bool validate (const CSVRow &row)
 Validate the row read from a file. More...
 

Additional Inherited Members

- Static Public Member Functions inherited from isc::util::CSVFile
static CSVRow EMPTY_ROW ()
 Represents empty row. More...
 

Detailed Description

Implements a CSV file that supports multiple versions of the file's "schema".

This allows files with older schemas to be upgraded to newer schemas as they are being read. The file's schema is defined through a list of column descriptors, or isc::util::VersionedColumn(s). Each descriptor contains metadata describing the column, consisting of the column's name, the version label in which the column was added to the schema, and a default value to be used if the column is missing from the file. Note that the column descriptors are defined in the order they occur in the file, when reading a row from left to right. This also assumes that as the schema evolves, all new columns are added at the end of the row. In other words, the order of the columns reflects not only the order in which they occur in a row but also the order in which they were added to the schema. Conceptually, the entire list of columns defined constitutes the current schema. Earlier schema versions are therefore leading subsets of this list. Creating the schema is done by calling VersionedCSVFile::addColumn() for each column. Note that the schema must be defined prior to opening the file.
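The ordering rule above can be illustrated with a small standalone model. This is only a sketch and does not use the Kea headers: ColumnSketch, SchemaSketch, versionOfPrefix() and exampleSchema() are hypothetical names, and the column names and version labels are made up for illustration. It assumes, as the text describes, that an earlier schema is simply the first N columns of the list.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a column descriptor (not Kea's VersionedColumn).
struct ColumnSketch {
    std::string name;           // column name as written in the header row
    std::string version;        // schema version in which the column was added
    std::string default_value;  // value supplied when the column is missing
};

// Descriptors are stored in file order, which is also the order of addition.
using SchemaSketch = std::vector<ColumnSketch>;

// Version label of the schema formed by the first `count` columns, or
// "undefined" when that prefix is empty.
std::string versionOfPrefix(const SchemaSketch& schema, std::size_t count) {
    assert(count <= schema.size());
    return count == 0 ? std::string("undefined") : schema[count - 1].version;
}

// Made-up three-column schema spanning two versions.
SchemaSketch exampleSchema() {
    return {
        {"address", "1.0", ""},
        {"hwaddr", "1.0", ""},
        {"state", "2.0", "0"},
    };
}
```

With this model, the full list reports the version of its last column, while a file whose header carries only the first two columns would be labelled with the older version.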

The first row of the file is always the header row and is a comma-separated list of the names of the columns in the file. This row is used when opening the file via VersionedCSVFile::open() to identify its schema version so that it may be read correctly. This is done by comparing the columns found in the header to the columns defined in the schema. The columns must match both by name and by the order in which they occur.

  1. If there are fewer columns in the header than in the schema, the file is presumed to be an earlier schema version and will be upgraded as it is read. There is an ability to mark a specific column as being the minimum column which must be present, see VersionedCSVFile::setMinimumValidColumns(). If the header columns do not match up to this minimum column, the file is presumed to be too old to upgrade and the open will fail. A valid, upgradable file will have an input schema state of VersionedCSVFile::NEEDS_UPGRADE.
  2. If there is a mismatch between a found column name and the column name defined for that position in the row, the file is presumed to be invalid and the open will fail.
  3. If the content of the header matches exactly the columns defined in the schema, the file is considered to match the schema exactly and the input schema state will be VersionedCSVFile::CURRENT.
  4. If there are columns in the header beyond all of the columns defined in the schema (i.e. the schema is a subset of the header), then the file is presumed to be from a newer version of Kea and can be downgraded. The input schema state of the file will be set to VersionedCSVFile::NEEDS_DOWNGRADE.
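The four cases above can be condensed into one comparison routine. The following is a rough sketch of the rules as this page states them, not the code in versioned_csv_file.cc: HeaderState, classifyHeader() and the min_valid parameter are hypothetical, and an Invalid value stands in for "the open will fail".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type; Invalid models the cases where the open fails.
enum class HeaderState { Current, NeedsUpgrade, NeedsDowngrade, Invalid };

// Compares header column names against the schema column names, in order.
// `min_valid` is the assumed minimum number of leading columns a file must
// contain to be considered upgradable.
HeaderState classifyHeader(const std::vector<std::string>& schema,
                           const std::vector<std::string>& header,
                           std::size_t min_valid) {
    assert(min_valid <= schema.size());

    // Case 2: any name mismatch at a shared position invalidates the file.
    const std::size_t shared = std::min(schema.size(), header.size());
    for (std::size_t i = 0; i < shared; ++i) {
        if (schema[i] != header[i]) {
            return HeaderState::Invalid;
        }
    }

    // Case 1: fewer columns than the schema means an older file, which is
    // upgradable only if it reaches the minimum column.
    if (header.size() < schema.size()) {
        return header.size() < min_valid ? HeaderState::Invalid
                                         : HeaderState::NeedsUpgrade;
    }

    // Case 4: extra trailing columns mean a newer file.
    if (header.size() > schema.size()) {
        return HeaderState::NeedsDowngrade;
    }

    // Case 3: exact match.
    return HeaderState::Current;
}
```

Checking for name mismatches first keeps the remaining cases a pure comparison of column counts.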

After successfully opening a file, rows are read one at a time via VersionedCSVFile::next() and handled according to the input schema state. Each data row is expected to have at least the same number of columns as were found in the header. Any row which has fewer values is discarded as invalid. Similarly, any row which is found to have more values than were found in the header is discarded as invalid.

When upgrading a row, the value for each missing column is filled in with the default value specified by that column's descriptor. When downgrading a row, extraneous values are dropped from the row.

It is important to note that upgrading or downgrading a file does NOT alter the physical file itself. Rather, the conversion occurs after the raw data has been read but before it is passed to the caller.
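As an illustration of the in-memory conversion, the sketch below pads or trims a row that has already been read. It is a simplified model, not the library's implementation: Row and convertRow() are hypothetical names, and the schema is reduced to a list holding one default value per defined column.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Returns a copy of `row` sized to the schema. `defaults[i]` is the assumed
// default value of schema column i. The input data itself is left untouched,
// mirroring the point that conversion never alters the physical file.
Row convertRow(Row row, const std::vector<std::string>& defaults) {
    assert(!defaults.empty());  // the schema must be defined first

    if (row.size() < defaults.size()) {
        // Upgrade: append the default for every missing trailing column.
        for (std::size_t i = row.size(); i < defaults.size(); ++i) {
            row.push_back(defaults[i]);
        }
    } else if (row.size() > defaults.size()) {
        // Downgrade: drop values beyond the last defined column.
        row.resize(defaults.size());
    }
    return row;
}
```

A two-value row read against a three-column schema gains the third column's default, while a four-value row loses its last value.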

Also note that there is currently no support for writing out a file in anything other than the current schema.

Definition at line 120 of file versioned_csv_file.h.

Member Enumeration Documentation

enum isc::util::VersionedCSVFile::InputSchemaState

Possible input file schema states.

Used to categorize the input file's schema, relative to the defined schema.

Enumerator
CURRENT 
NEEDS_UPGRADE 
NEEDS_DOWNGRADE 

Definition at line 126 of file versioned_csv_file.h.

Constructor & Destructor Documentation

isc::util::VersionedCSVFile::VersionedCSVFile ( const std::string &  filename)

Constructor.

Parameters
filename  CSV file name.

Definition at line 14 of file versioned_csv_file.cc.

isc::util::VersionedCSVFile::~VersionedCSVFile ( )
virtual

Destructor.

Definition at line 20 of file versioned_csv_file.cc.

Member Function Documentation

void isc::util::VersionedCSVFile::addColumn ( const std::string &  col_name,
const std::string &  version,
const std::string &  default_value = "" 
)

Adds metadata for a single column to the schema.

This method appends a new column description to the file's schema. Note that this does not cause anything to be written to the physical file. The name of the column will be placed in the CSV header when a new file is created by calling the recreate or open function.

Parameters
col_name  Name of the column.
version  Text representation of the schema version in which this column first appeared.
default_value  Value the missing column should be given during an upgrade. It defaults to an empty string, "".
Exceptions
CSVFileError  if a column with the specified name exists.

Definition at line 24 of file versioned_csv_file.cc.

References isc::util::CSVFile::addColumn().


void isc::util::VersionedCSVFile::columnCountError ( const CSVRow &  row,
const std::string &  reason 
)
protected

Convenience method for adding an error message.

Constructs an error message indicating that the number of columns in a given row is wrong and why, then adds it to readMsg.

Parameters
row  The row in error.
reason  An explanation as to why the row column count is wrong.

Definition at line 182 of file versioned_csv_file.cc.

References isc::util::CSVFile::getFilename(), isc::util::CSVRow::getValuesCount(), and isc::util::CSVFile::setReadMsg().

Referenced by next().


size_t isc::util::VersionedCSVFile::getInputHeaderCount ( ) const

Returns the number of columns found in the input header.

Definition at line 56 of file versioned_csv_file.cc.

Referenced by next(), and validateHeader().

VersionedCSVFile::InputSchemaState isc::util::VersionedCSVFile::getInputSchemaState ( ) const

Fetches the state of the input file's schema.

Reflects the state of the input file's schema relative to the defined schema as an enum, InputSchemaState.

Returns
VersionedCSVFile::CURRENT if the input file schema matches the defined schema, NEEDS_UPGRADE if the input file schema is older, and NEEDS_DOWNGRADE if it is newer

Definition at line 85 of file versioned_csv_file.cc.

Referenced by next().

std::string isc::util::VersionedCSVFile::getInputSchemaVersion ( ) const

Returns the schema version of the physical file.

Returns
text version of the schema found or string "undefined" if the file has not been opened

Definition at line 95 of file versioned_csv_file.cc.

References getValidColumnCount(), and getVersionedColumn().


size_t isc::util::VersionedCSVFile::getMinimumValidColumns ( ) const

Returns the minimum number of columns which must be present for the file to be considered valid.

Definition at line 46 of file versioned_csv_file.cc.

Referenced by validateHeader().

std::string isc::util::VersionedCSVFile::getSchemaVersion ( ) const

Returns the text version of the current schema supported by the file's metadata.

Returns
text version info assigned to the last column in the list of defined columns, or the string "undefined" if no columns have been defined.

Definition at line 104 of file versioned_csv_file.cc.

References isc::util::CSVFile::getColumnCount(), and getVersionedColumn().


size_t isc::util::VersionedCSVFile::getValidColumnCount ( ) const

Returns the number of valid columns found in the header.

For newly created files this will always match the number of defined columns (i.e. getColumnCount()). For existing files, this will be the number of columns in the header that match the defined columns. When this number is less than getColumnCount() it means the input file is from an earlier schema. This value is zero until the file has been opened.

Definition at line 51 of file versioned_csv_file.cc.

Referenced by getInputSchemaVersion(), next(), and validateHeader().

const VersionedColumnPtr & isc::util::VersionedCSVFile::getVersionedColumn ( const size_t  index) const

Fetch the column descriptor for a given index.

Parameters
index  Index within the list of columns of the desired column.
Returns
a pointer to the VersionedColumn at the given index
Exceptions
OutOfRange  exception if the index is invalid

Definition at line 113 of file versioned_csv_file.cc.

References isc::util::CSVFile::getColumnCount(), isc::util::CSVFile::getFilename(), and isc_throw.

Referenced by getInputSchemaVersion(), and getSchemaVersion().


bool isc::util::VersionedCSVFile::needsConversion ( ) const

Returns true if the input file schema state is not CURRENT.

Definition at line 90 of file versioned_csv_file.cc.

References CURRENT.

bool isc::util::VersionedCSVFile::next ( CSVRow &  row)

Reads the next row from the file.

This function will return the CSVRow object representing a parsed row if parsing is successful. If the end of file has been reached, the empty row is returned (a row containing no values).

  1. If the row has fewer values than were found in the header it is discarded as invalid.
  2. If the row is found to have more values than are defined in the schema it is discarded as invalid.

When a valid row has fewer than the defined number of columns, the value for each missing column is filled in with the default value specified by that column's descriptor.
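The two validity rules listed for this function reduce to a pair of count comparisons. The sketch below is a hedged illustration of those two rules only, for the case where the header has no more columns than the schema; RowCheck and checkRow() are hypothetical names rather than part of the class.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical outcome of the column-count check on a single data row.
enum class RowCheck { Ok, TooFew, TooMany };

// `row_values`   : number of values parsed from the row
// `header_count` : number of columns found in the input header
// `schema_count` : number of columns defined in the schema
RowCheck checkRow(std::size_t row_values, std::size_t header_count,
                  std::size_t schema_count) {
    // This sketch only covers headers that fit within the schema.
    assert(header_count <= schema_count);

    if (row_values < header_count) {
        return RowCheck::TooFew;   // rule 1: fewer values than the header
    }
    if (row_values > schema_count) {
        return RowCheck::TooMany;  // rule 2: more values than the schema
    }
    return RowCheck::Ok;           // any missing columns can be defaulted
}
```

A row that passes with fewer values than the schema defines is the one that then receives default values.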

Parameters
[out] row  Object receiving the parsed CSV row.
Returns
true if row has been read and validated; false if validation failed.

Definition at line 124 of file versioned_csv_file.cc.

References isc::util::CSVRow::append(), columnCountError(), CURRENT, isc::util::CSVFile::EMPTY_ROW(), isc::util::CSVFile::getColumnCount(), getInputHeaderCount(), getInputSchemaState(), getValidColumnCount(), isc::util::CSVRow::getValuesCount(), NEEDS_DOWNGRADE, NEEDS_UPGRADE, isc::util::CSVFile::next(), isc::util::CSVFile::setReadMsg(), and isc::util::CSVRow::trim().


void isc::util::VersionedCSVFile::open ( const bool  seek_to_end = false)
virtual

Opens existing file or creates a new one.

This function will try to open the existing file if its size is greater than 0. If the file doesn't exist or has a size of 0, the file is recreated. If the existing file has been opened, the header is parsed and validated against the schema. By default, the data pointer in the file is set to the beginning of the first data row. In order to retrieve the row contents, the next function should be called. If the seek_to_end parameter is set to true, the file will be opened and the internal pointer will be set to the end of the file.
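The "open or recreate" decision described above depends only on whether the file exists and is non-empty. A minimal standalone sketch of that test, using only the C++ standard library, might look as follows; shouldRecreate() is a hypothetical helper and is not claimed to be how the class performs the check.

```cpp
#include <cassert>
#include <fstream>
#include <ios>
#include <string>

// Returns true when the file at `path` is missing, unreadable, or empty,
// which are the situations in which the text says the file is recreated.
bool shouldRecreate(const std::string& path) {
    assert(!path.empty());

    // Open positioned at the end so tellg() reports the file size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in.is_open()) {
        return true;
    }
    return static_cast<std::streamoff>(in.tellg()) <= 0;
}
```

When this returns false, the existing file would be opened and its header validated against the schema.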

Parameters
seek_to_end  A boolean value which indicates if the input and output file pointer should be set at the end of file.
Exceptions
VersionedCSVFileError  if the schema has not been defined
CSVFileError  when an IO operation fails or the header fails to validate

Reimplemented from isc::util::CSVFile.

Reimplemented in isc::dhcp::CSVLeaseFile4, and isc::dhcp::CSVLeaseFile6.

Definition at line 61 of file versioned_csv_file.cc.

References isc::util::CSVFile::getColumnCount(), isc::util::CSVFile::getFilename(), isc_throw, and isc::util::CSVFile::open().


void isc::util::VersionedCSVFile::recreate ( )
virtual

Creates a new CSV file.

The file creation will fail if there are no columns specified. Otherwise, this function will write the header to the file. In order to write rows to the opened file, the append function should be called.

Exceptions
VersionedCSVFileError  if the schema has not been defined
CSVFileError  if an IO operation fails

Reimplemented from isc::util::CSVFile.

Definition at line 72 of file versioned_csv_file.cc.

References isc::util::CSVFile::getColumnCount(), isc::util::CSVFile::getFilename(), isc_throw, and isc::util::CSVFile::recreate().


void isc::util::VersionedCSVFile::setMinimumValidColumns ( const std::string &  column_name)

Sets the minimum number of valid columns based on a given column.

Parameters
column_name  Name of the column which positionally represents the minimum columns which must be present in a file for it to be considered valid.

Definition at line 33 of file versioned_csv_file.cc.

References isc::util::CSVFile::getColumnIndex(), and isc_throw.


bool isc::util::VersionedCSVFile::validateHeader ( const CSVRow &  header)
protected virtual

Validates the header of a VersionedCSVFile.

This function is called internally when reading in an existing file. It parses the header row of the file, comparing each value in succession against the defined list of columns. If the header contains too few matching columns (i.e. fewer than minimum_valid_columns_) or too many (more than the number of defined columns), the file is presumed to be either too old, too new, or too corrupt to process. Otherwise it retains the number of valid columns found and deems the header valid.

Parameters
header  A row holding a header.
Returns
true if header matches the columns; false otherwise.

Reimplemented from isc::util::CSVFile.

Definition at line 192 of file versioned_csv_file.cc.

References isc::util::CSVFile::getColumnCount(), isc::util::CSVFile::getColumnName(), getInputHeaderCount(), getMinimumValidColumns(), getValidColumnCount(), isc::util::CSVRow::getValuesCount(), isc_throw, NEEDS_DOWNGRADE, NEEDS_UPGRADE, isc::util::CSVRow::readAt(), and isc::util::CSVFile::setReadMsg().



The documentation for this class was generated from the following files: