ged2doc.input
Module which handles input files.
This module is responsible for locating all files (GEDCOM data and images) given the application inputs. Currently it handles two cases:
- The input is specified as the path to a GEDCOM file. That file can contain names of image files that are either absolute or relative to the directory containing the GEDCOM file or to some other directory. Program options can specify the directory where images are located.
- The input file is a ZIP archive that includes both the GEDCOM file and the image files. Depending on how the GEDCOM file and the archive were prepared, names of image files in the GEDCOM file can be given as absolute paths to their original location or as paths relative to their common directory.
An additional issue to consider is that files can be prepared on a system different from the one where they are parsed. For example, a GEDCOM file could be prepared on a Windows machine, with names of image files given using the Windows path convention (either absolute, as C:\Users\JosephSmith\Documents\Pictures\Family\Tree\Me.BMP, or relative, as Pictures\Family\Tree\Me.BMP), and later this GEDCOM file could be copied to a Linux host and processed using the ged2doc package. Files on the Linux machine will have different absolute and possibly relative paths (and definitely a different path separator character).
In the case of a ZIP archive, the names of images in the GEDCOM file could be different from the names in the archive (e.g. the image path C:\Users\JosephSmith\Documents\Pictures\Family\Tree\Me.BMP in the GEDCOM file could be stored in the ZIP archive as Pictures/Family/Tree/Me.BMP).
The logic in this module is meant to handle all cases where the names of files in the GEDCOM file differ from their location on the target storage system.
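The matching problem described above can be illustrated with a minimal sketch. This is not the algorithm ged2doc implements; the helper names path_parts and find_matches are hypothetical, and comparing components case-insensitively is an assumption made here because Windows file names are not case-sensitive. The idea is to split a name from the GEDCOM file into components regardless of separator, then prefer the candidate that shares the longest run of trailing components.

```python
from pathlib import PurePosixPath, PureWindowsPath


def path_parts(name):
    """Split an image name from a GEDCOM file into path components.

    PureWindowsPath accepts both backslash and forward slash as
    separators, so it copes with names written on either system.
    The drive and root (the "anchor") are dropped.
    """
    path = PureWindowsPath(name)
    return [part for part in path.parts if part != path.anchor]


def find_matches(gedcom_name, candidates):
    """Return the candidates sharing the longest trailing run of
    components with gedcom_name (hypothetical helper).

    candidates are POSIX-style names, e.g. ZIP archive member names.
    More than one result means the name is ambiguous.
    """
    wanted = [part.lower() for part in path_parts(gedcom_name)]
    best, best_len = [], 0
    for candidate in candidates:
        have = [part.lower() for part in PurePosixPath(candidate).parts]
        common = 0
        # Compare components from the file name backwards.
        for a, b in zip(reversed(wanted), reversed(have)):
            if a != b:
                break
            common += 1
        if common > best_len:
            best, best_len = [candidate], common
        elif common == best_len and common > 0:
            best.append(candidate)
    return best
```

With this sketch, the absolute Windows path from the example above would select Pictures/Family/Tree/Me.BMP over an unrelated Me.BMP elsewhere in the archive, while a bare file name that occurs in two directories would return both candidates and leave the ambiguity to the caller.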
The GEDCOM file returned by this module is typically passed to methods in the ged4py package, and that package expects a true filesystem-backed file which supports the seek() and tell() methods. Image files do not typically need support for these methods and are usually read as a byte stream using the read() method. This module returns a seekable file object open in binary mode for the GEDCOM file (meaning that a temporary file on disk may need to be created in some cases) and a “simple” binary stream for images.
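One way to obtain a file-backed, seekable object from a stream that is not a real file (an archive member, for instance) is to copy it into a temporary file. The sketch below shows that general technique with the standard library only; it is not taken from ged2doc, and the helper name seekable_copy and the member name tree.ged are hypothetical.

```python
import io
import shutil
import tempfile
import zipfile


def seekable_copy(stream):
    """Copy a binary stream into a temporary file and rewind it.

    The temporary file is removed automatically when it is closed.
    """
    tmp = tempfile.TemporaryFile()
    shutil.copyfileobj(stream, tmp)
    tmp.seek(0)
    return tmp


# Demo: build a small in-memory ZIP archive with one GEDCOM member.
archive_bytes = io.BytesIO()
with zipfile.ZipFile(archive_bytes, "w") as archive:
    archive.writestr("tree.ged", b"0 HEAD\n0 TRLR\n")

# Extract the member into a temporary file that supports seek()/tell().
with zipfile.ZipFile(archive_bytes) as archive:
    with archive.open("tree.ged") as member:
        gedcom = seekable_copy(member)

first_line = gedcom.readline()
position = gedcom.tell()   # offset just past the first line
gedcom.seek(0)             # rewind for the next reader
```

The caller owns the returned file and should close it when done, at which point the temporary copy disappears.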
Functions
make_file_locator(input_file, file_name_pattern, image_path) | Create and return file locator instance.
Classes
FileLocator | Abstract interface for file locator instances.
Exceptions
MultipleMatchesError | Class for exceptions generated when there is more than one file matching specified criteria.
ged2doc.input.make_file_locator(input_file, file_name_pattern, image_path)
Create and return file locator instance.
For a given input file (which can be a GEDCOM file or a ZIP archive) return the corresponding file locator object (an instance of the FileLocator type).
- Parameters
- input_file
Path of the input file or a file object; can be a ZIP archive or a GEDCOM file. If the argument is a file object then it must support the seek() method and be open in binary mode.
- file_name_pattern : str
If the input file is a ZIP archive then this pattern is used to search for a GEDCOM file in the archive. It could be "*.ged", for example, or a more specific pattern.
- image_path : str
Directory on the filesystem where images are found. Images can be located in sub-directories of the given path. If input_file is a ZIP archive then images are searched for inside the ZIP archive and then in image_path. If image_path is None then the filesystem is not searched for files. If image_path is an empty string then the current directory is searched.
- Returns
- locator : FileLocator
File locator instance.
- Raises
- OSError
Raised if the file is not found.
- AttributeError
Raised if a file object is given as the input file but it does not support the seek() method.
class ged2doc.input.FileLocator
Bases: object
Abstract interface for file locator instances.
Methods
open_gedcom() | Returns file object for the input GEDCOM file.
open_image(name) | Returns open file object for the named image file.
abstract open_gedcom()
Returns file object for the input GEDCOM file.
If no GEDCOM file is found, None is returned. If more than one file is found then a MultipleMatchesError exception is raised. Other exceptions can be raised as well, e.g. if the file cannot be opened.
The returned file object will be open in binary mode and will support the seek() and tell() methods. Note that this may be a temporary file which will be deleted after the file is closed.
- Returns
- file
File object open in binary mode supporting the seek() and tell() methods.
- Raises
- MultipleMatchesError
Raised if more than one file is found.
abstract open_image(name)
Returns open file object for the named image file.
If the image file is not found, None is returned. If more than one matching file is found then a MultipleMatchesError exception is raised. Other exceptions can be raised as well if the file cannot be opened.
Note that this file object may not support all operations (it may be an object inside a ZIP archive, for example), so you may need to copy it if you want full file protocol support.
- Parameters
- name : str
Name of the image file to open. This can be a relative or an absolute path name. Usually this is the name that is stored in the GEDCOM file, and it can use a separator character different from that of the system reading the file.
- Returns
- image
File object open in binary mode; only the read() method is guaranteed to work.
- Raises
- MultipleMatchesError
Raised if more than one file is found.
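Since only read() is guaranteed on the object returned by open_image(), a caller that needs seek() or tell() can buffer the image in memory. The following is a minimal sketch of that pattern, not code from ged2doc: read_image is a hypothetical helper, and DictLocator is a hypothetical stand-in used only so the example runs without the package; any object with an open_image(name) method that follows the contract above would work in its place.

```python
import io


def read_image(locator, name):
    """Return the named image as a seekable in-memory stream.

    Returns None if the locator does not find the image. Exceptions
    raised by the locator (for example on ambiguous names) propagate
    to the caller.
    """
    image = locator.open_image(name)
    if image is None:
        return None
    try:
        # read() is the only method the interface guarantees.
        return io.BytesIO(image.read())
    finally:
        # close() is assumed, not guaranteed, so call it only if present.
        close = getattr(image, "close", None)
        if close is not None:
            close()


class DictLocator:
    """Hypothetical stand-in for a file locator, backed by a dict of bytes."""

    def __init__(self, images):
        self._images = images

    def open_image(self, name):
        data = self._images.get(name)
        return None if data is None else io.BytesIO(data)
```

Buffering in memory suits typical photo sizes; for very large files, copying to a temporary file would be the more conservative choice.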
_abc_impl = <_abc_data object>
-
abstract