Inheritance diagram for lsst.datarel.datasetScanner.HfsScanner:

Public Member Functions
def	__init__ (self, template)

def	walk (self, root, rules=None)

Private Attributes
	_formatKeys

	_pathComponents

Detailed Description

A hierarchical scanner for paths matching a template, optionally
also restricting visited paths to those matching a list of dataId rules.

Definition at line 211 of file datasetScanner.py.

Constructor & Destructor Documentation

◆ init()

def lsst.datarel.datasetScanner.HfsScanner.__init__	(	self,
		template
	)

Build an FsScanner for given a path template. The path template
should be a Python string with named format substitution
specifications, as used in mapper policy files. For example:

deepCoadd-results/%(filter)s/%(tract)d/%(patch)s/calexp-%(filter)s-%(tract)d-%(patch)s.fits

Note that a key may appear multiple times. If it does,
the value for each occurrence should be identical (the formatting
specs must be identical). Octal, binary, hexadecimal, and floating
point formats are not supported.

Definition at line 216 of file datasetScanner.py.

     def __init__(self, template):
         """Build an FsScanner for given a path template. The path template
         should be a Python string with named format substitution
         specifications, as used in mapper policy files. For example:
 
         deepCoadd-results/%(filter)s/%(tract)d/%(patch)s/calexp-%(filter)s-%(tract)d-%(patch)s.fits
 
         Note that a key may appear multiple times. If it does,
         the value for each occurrence should be identical (the formatting
         specs must be identical). Octal, binary, hexadecimal, and floating
         point formats are not supported.
         """
         template = os.path.normpath(template)
         if (len(template) == 0 or
                 template == os.curdir or
                 template[0] == os.sep or
                 template[-1] == os.sep):
             raise RuntimeError(
                 'Path template is empty, absolute, or identifies a directory')
         self._formatKeys = {}
         self._pathComponents = []
         fmt = re.compile(r'%\((\w+)\).*?([diucrs])')
 
         # split path into components
         for component in template.split(os.sep):
             # search for all occurences of a format spec
             simple = True
             last = 0
             regex = ''
             newKeys = []
             for m in fmt.finditer(component):
                 simple = False
                 spec = m.group(0)
                 k = m.group(1)
                 seenBefore = k in self._formatKeys
                 # transform format spec into a regular expression
                 regex += re.escape(component[last:m.start(0)])
                 last = m.end(0)
                 regex += '('
                 if seenBefore:
                     regex += '?:'
                 if m.group(2) in 'crs':
                     munge = _mungeStr
                     typ = str
                     regex += r'.+)'
                 else:
                     munge = _mungeInt
                     typ = int
                     regex += r'[+-]?\d+)'
                 if seenBefore:
                     # check consistency of formatting spec across key occurences
                     if spec[-1] != self._formatKeys[k].spec[-1]:
                         raise RuntimeError(
                             'Path template contains inconsistent format type-codes '
                             'for the same key')
                 else:
                     newKeys.append(k)
                     self._formatKeys[k] = _FormatKey(spec, typ, munge)
             regex += re.escape(component[last:])
             if simple:
                 regex = component  # literal match
             else:
                 regex = re.compile('^' + regex + '$')
             self._pathComponents.append(_PathComponent(newKeys, regex, simple))
 

Member Function Documentation

◆ walk()

def lsst.datarel.datasetScanner.HfsScanner.walk	(	self,
		root,
		rules = `None`
	)

Generator that descends the given root directory in top-down
fashion, matching paths corresponding to the template and satisfying
the given rule list. The generator yields tuples of the form
(path, dataId), where path is a dataset file name relative to root,
and dataId is a key value dictionary identifying the file.

Definition at line 281 of file datasetScanner.py.

     def walk(self, root, rules=None):
         """Generator that descends the given root directory in top-down
         fashion, matching paths corresponding to the template and satisfying
         the given rule list. The generator yields tuples of the form
         (path, dataId), where path is a dataset file name relative to root,
         and dataId is a key value dictionary identifying the file.
         """
         oneFound = False
         while os.path.exists(root) and not oneFound:
             stack = [(0, root, rules, {})]
             while stack:
                 depth, path, rules, dataId = stack.pop()
                 if os.path.isfile(path):
                     continue
                 pc = self._pathComponents[depth]
                 if pc.simple:
                     # No need to list directory contents
                     entries = [pc.regex]
                     if not os.path.exists(os.path.join(path, pc.regex)):
                         continue
                 else:
                     entries = os.listdir(path)
                 depth += 1
                 for e in entries:
                     subRules = rules
                     subDataId = dataId
                     if not pc.simple:
                         # make sure e matches path component regular expression
                         m = pc.regex.match(e)
                         if not m:
                             continue
                         # got a match - update dataId with new key values (if any)
                         try:
                             for i, k in enumerate(pc.keys):
                                 subDataId = self._formatKeys[k].munge(k, m.group(i + 1), subDataId)
                         except:
                             # Munger raises if value is invalid for key, so
                             # not really a match
                             continue
                         if subRules and pc.keys:
                             # have dataId rules and saw new keys; filter rule list
                             for k in subDataId:
                                 newRules = []
                                 for r in subRules:
                                     if k not in r or subDataId[k] in r[k]:
                                         newRules.append(r)
                                 subRules = newRules
                             if not subRules:
                                 continue  # no rules matched
                     # Have path matching template and at least one rule
                     p = os.path.join(path, e)
                     if depth < len(self._pathComponents):
                         # recurse
                         stack.append((depth, p, subRules, subDataId))
                     elif depth == len(self._pathComponents):
                         if os.path.isfile(p):
                             # found a matching file, yield it
                             yield os.path.relpath(p, root), subDataId
                             oneFound = True
             # end while stack
             root = os.path.join(root, "_parent")
 
 
 # -- Camera specific dataId mungers ----
 

Member Data Documentation

◆ _formatKeys

lsst.datarel.datasetScanner.HfsScanner._formatKeys

private

Definition at line 235 of file datasetScanner.py.

◆ _pathComponents

lsst.datarel.datasetScanner.HfsScanner._pathComponents

private

Definition at line 236 of file datasetScanner.py.

The documentation for this class was generated from the following file:

/home/jenkins-slave/snowflake/release/lsstsw/stack/Linux64/datarel/14.0+77/python/lsst/datarel/datasetScanner.py

Public Member Functions

Private Attributes