pyemma.coordinates.data.PyCSVReader
class pyemma.coordinates.data.PyCSVReader(*args, **kwargs)
Reader for tabulated ASCII data
This class uses numpy to convert string data into array data.
Parameters
filenames (str or list of str) – files to be read
chunksize (int, optional) – how many lines to process at once
delimiters (str, list of str or None) –
if str (e.g. '\t'), then this delimiter is used for all filenames.
if list of delimiter strings, its length has to match the number of filenames.
if not given, it will be guessed (may fail, e.g. for one-dimensional data).
comments (str, list of str or None, default='#') – Lines starting with this character will be ignored, except for the first line (header)
converters (dict, optional (Not yet implemented)) – A dictionary mapping column number to a function that will convert that column to a float. E.g., if column 0 is a date string: converters = {0: datestr2num}. Converters can also be used to provide a default value for missing data: converters = {3: lambda s: float(s.strip() or 0)}.
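The converters mapping reads like the parameter of the same name in numpy.loadtxt. Because it is documented above as not yet implemented, the following is only a plain-Python sketch of the intended semantics; apply_converters is a hypothetical helper and not part of PyEMMA.

```python
def apply_converters(fields, converters=None):
    """Convert one row of string fields to floats.

    Hypothetical helper (not part of PyEMMA): columns listed in
    `converters` use the given function, all other columns use float().
    """
    converters = converters or {}
    return [converters.get(i, float)(s) for i, s in enumerate(fields)]


# Default value for missing data in column 3, as in the example above.
converters = {3: lambda s: float(s.strip() or 0)}

# The last field is blank, so the converter substitutes 0.
row = "1.5,2.5,3.5, ".split(",")
values = apply_converters(row, converters)
```

A blank field would make a bare float() call raise ValueError, which is why the default-value converter strips the string first.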
Notes
For reading files with only one column, one needs to specify a delimiter…
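To see why single-column files need an explicit delimiter, it may help to look at how delimiter guessing works in general. The sketch below uses csv.Sniffer from the Python standard library purely as an illustration; that PyCSVReader guesses in a comparable way is an assumption, and guess_delimiter is a hypothetical helper.

```python
import csv


def guess_delimiter(sample, candidates=",;\t "):
    """Guess the delimiter of a text sample with csv.Sniffer.

    Illustration only: not taken from the PyCSVReader implementation.
    """
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter


# Two columns: the tab occurs consistently on every line and can be guessed.
two_columns = "1.0\t2.0\n3.0\t4.0\n"
guessed = guess_delimiter(two_columns)

# One column: the data contains no delimiter at all, so nothing reliable can
# be guessed. Passing a delimiter explicitly avoids the guessing step.
one_column = "1.0\n2.0\n3.0\n"
rows = [[float(x) for x in record]
        for record in csv.reader(one_column.splitlines(), delimiter=",")]
```

With an explicit delimiter, each line of the single-column sample is parsed as a row with exactly one field.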
__init__(filenames, chunksize=1000, delimiters=None, comments='#', converters=None, **kwargs)
Initialize self. See help(type(self)) for accurate signature.
Methods
_Loggable__create_logger()
_PyCSVReader__parse_args(arg, default, n)
_SerializableMixIn__interpolate(state, klass)
__delattr__(name, /) – Implement delattr(self, name).
__dir__() – Default dir() implementation.
__eq__(value, /) – Return self==value.
__format__(format_spec, /) – Default object formatter.
__ge__(value, /) – Return self>=value.
__getattribute__(name, /) – Return getattr(self, name).
__getstate__()
__gt__(value, /) – Return self>value.
__hash__() – Return hash(self).
__init__(filenames[, chunksize, delimiters, …]) – Initialize self.
__init_subclass__(*args, **kwargs) – This method is called when a class is subclassed.
__iter__()
__le__(value, /) – Return self<=value.
__lt__(value, /) – Return self<value.
__ne__(value, /) – Return self!=value.
__new__(cls, *args, **kwargs) – Create and return a new object.
__reduce__() – Helper for pickle.
__reduce_ex__(protocol, /) – Helper for pickle.
__repr__() – Return repr(self).
__setattr__(name, value, /) – Implement setattr(self, name, value).
__setstate__(state)
__sizeof__() – Size of object in memory, in bytes.
__str__() – Return str(self).
__subclasshook__ – Abstract classes can override this to customize issubclass().
_calc_offsets(fh) – Determines byte offsets between all lines; fh is the file handle to obtain byte offsets from.
_chunk_finite(data)
_cleanup_logger(logger_id, logger_name)
_clear_in_memory()
_compute_default_cs(dim, itemsize[, logger])
_create_iterator([skip, chunk, stride, …]) – Should be implemented by non-abstract subclasses.
_data_flow_chain() – Get a list of all elements in the data flow graph.
_determine_dialect(fh, length) – fh is the file handle for which the dialect should be determined.
_get_classes_to_inspect() – gets classes self derives from which 1.
_get_dialect(itraj)
_get_dimension(fh, dialect, skip)
_get_interpolation_map(cls)
_get_private_field(cls, name[, default])
_get_serialize_fields(cls)
_get_state_of_serializeable_fields(klass, state) – Returns a dictionary {k: v} for k in self.serialize_fields and v = getattr(self, k).
_get_traj_info(filename)
_get_version(cls[, require])
_get_version_for_class_from_state(state, klass) – Retrieves the version of the current klass from the state mapping from old locations to new ones.
_logger_is_active(level) – level: int log level (debug=10, info=20, warn=30, error=40, critical=50).
_map_to_memory([stride]) – Maps results to memory.
_set_state_from_serializeable_fields_and_state(…) – Sets only those fields from state which are present in klass.__serialize_fields.
_source_from_memory([data_producer])
describe()
dimension()
get_output([dimensions, stride, skip, chunk]) – Maps all input data of this transformer and returns it as an array or list of arrays.
iterator([stride, lag, chunk, …]) – Creates an iterator to stream over the (transformed) data.
load(file_name[, model_name]) – Loads a previously saved PyEMMA object from disk.
n_chunks(chunksize[, stride, skip]) – How many chunks an iterator of this source will output, starting (eg.
n_frames_total([stride, skip]) – Returns total number of frames.
number_of_trajectories([stride]) – Returns the number of trajectories.
output_type() – By default transformers return single precision floats.
save(file_name[, model_name, overwrite, …]) – Saves the current state of this object to given file and name.
trajectory_length(itraj[, stride, skip]) – Returns the length of trajectory of the requested index.
trajectory_lengths([stride, skip]) – Returns the length of each trajectory.
write_to_csv([filename, extension, …]) – Writes all data to csv with numpy.savetxt.
write_to_hdf5(filename[, group, …]) – Writes all data of this Iterable to a given HDF5 file.
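The chunksize parameter together with iterator and n_chunks describes a streaming access pattern: rows are parsed and handed out in blocks instead of all at once. The following plain-Python sketch illustrates that pattern only. The functions iter_chunks and n_chunks below are hypothetical stand-ins, and the exact meaning of stride and skip (drop the first skip rows, then keep every stride-th row) is an assumption rather than a statement about the PyEMMA implementation.

```python
import math


def iter_chunks(lines, chunksize=1000, stride=1, skip=0):
    """Yield lists of parsed rows with at most `chunksize` rows each.

    Assumed semantics: the first `skip` rows are dropped, then every
    `stride`-th row is kept.
    """
    chunk = []
    for i, line in enumerate(lines):
        if i < skip or (i - skip) % stride:
            continue
        chunk.append([float(x) for x in line.split()])
        if len(chunk) == chunksize:
            yield chunk
            chunk = []
    if chunk:
        # Emit the final, possibly shorter chunk.
        yield chunk


def n_chunks(n_rows, chunksize, stride=1, skip=0):
    """Number of chunks iter_chunks yields for `n_rows` input rows."""
    kept = math.ceil(max(n_rows - skip, 0) / stride)
    return math.ceil(kept / chunksize)


# Ten rows of two columns; skip one row, keep every second row afterwards.
lines = ["%d %d" % (i, 2 * i) for i in range(10)]
chunks = list(iter_chunks(lines, chunksize=3, stride=2, skip=1))
```

Only one chunk is held in memory at a time, which is the point of processing a limited number of lines at once.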
Attributes
DEFAULT_OPEN_MODE
_DataSource__serialize_fields
_FALLBACK_CHUNKSIZE
_InMemoryMixin__serialize_fields
_InMemoryMixin__serialize_version
_Loggable__ids
_Loggable__refs
_PyCSVReader__serialize_version
_SerializableMixIn__serialize_fields
_SerializableMixIn__serialize_modifications_map
_SerializableMixIn__serialize_version
__abstractmethods__
__dict__
__doc__
__module__
__weakref__ – list of weak references to the object (if defined)
_abc_impl
_loglevel_CRITICAL
_loglevel_DEBUG
_loglevel_ERROR
_loglevel_INFO
_loglevel_WARN
_save_data_producer
_serialize_version
chunksize
data_producer – The data producer for this data source object (can be another data source object).
default_chunksize – How much data will be processed at once, in case no chunksize has been provided.
filenames – list of file names the data is originally being read from.
in_memory – are results stored in memory?
is_random_accessible – Check if self._is_random_accessible is set to true and if all the random access strategies are implemented.
is_reader – Property telling if this data source is a reader or not.
logger – The logger for this class instance
name – The name of this instance
ndim
ntraj
ra_itraj_cuboid – Implementation of random access with slicing that can be up to 3-dimensional, where the first dimension corresponds to the trajectory index, the second dimension corresponds to the frames and the third dimension corresponds to the dimensions of the frames.
ra_itraj_jagged – Behaves like ra_itraj_cuboid just that the trajectories are not truncated and returned as a list.
ra_itraj_linear – Implementation of random access that takes arguments as the default random access (i.e., up to three dimensions with trajs, frames and dims, respectively), but which considers the frame indexing to be contiguous.
ra_linear – Implementation of random access that takes a (maximal) two-dimensional slice where the first component corresponds to the frames and the second component corresponds to the dimensions.
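Random access to individual frames of a text file is commonly built on a table of byte offsets, one per line, which is what the _calc_offsets summary above describes. The sketch below shows that general idea with the standard library only. calc_offsets and read_frame are hypothetical helpers, and how PyCSVReader actually stores and uses its offsets is not shown on this page.

```python
import io


def calc_offsets(fh):
    """Return the byte offset at which each line of a binary file starts."""
    fh.seek(0)
    offsets = []
    pos = 0
    for line in fh:
        offsets.append(pos)
        pos += len(line)
    return offsets


def read_frame(fh, offsets, index, delimiter=None):
    """Seek directly to line `index` and parse it as a list of floats."""
    fh.seek(offsets[index])
    fields = fh.readline().decode("ascii").split(delimiter)
    return [float(x) for x in fields]


# An in-memory file with three frames of two columns each.
fh = io.BytesIO(b"0.0 1.0\n10.0 11.0\n20.0 21.0\n")
offsets = calc_offsets(fh)
frame = read_frame(fh, offsets, 2)
```

Once the offsets are known, any frame can be read without scanning the lines before it.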