2021SC@SDUSC Yamada Zhiyun 2. Analysis of seafobj: 1

First, let's list the contents of the seafobj directory:

backends
utils
__init__.py
blocks.py
commits.py
commit_differ.py
db.py
exceptions.py
fs.py
objstore_factory.py

Open __init__.py and you will see:

from .commits import commit_mgr
from .fs import fs_mgr
from .blocks import block_mgr
from .commit_differ import CommitDiffer

Before looking at these modules, let's first analyze db.py, exceptions.py and objstore_factory.py

db.py

There are three functions

create_engine_from_conf(config)
init_db_session_class(config)
ping_connection(dbapi_connection, connection_record, connection_proxy)

According to the calling relationship, the analysis will be carried out in the following order

ping_connection

def ping_connection(dbapi_connection, connection_record, connection_proxy):
    cursor = dbapi_connection.cursor()
    try:
        cursor.execute("SELECT 1")
        cursor.close()
    except:
        logging.info('fail to ping database server, disposing all cached connections')
        connection_proxy._pool.dispose() # pylint: disable=protected-access

        # Raise DisconnectionError so the pool would create a new connection
        raise DisconnectionError()
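
The idea behind ping_connection can be tried out without a database server. The following is a minimal sketch, not seafobj's code: it assumes the standard-library sqlite3 module in place of MySQL, and StaleConnection is a hypothetical exception standing in for SQLAlchemy's DisconnectionError.

```python
import sqlite3


class StaleConnection(Exception):
    """Hypothetical stand-in for SQLAlchemy's DisconnectionError."""


def ping(dbapi_connection):
    """Run a trivial query; raise StaleConnection if the connection is dead."""
    try:
        cursor = dbapi_connection.cursor()
        cursor.execute("SELECT 1")
        cursor.close()
    except sqlite3.Error:
        # A real pool would discard the connection here and open a new one.
        raise StaleConnection("connection is no longer usable")


conn = sqlite3.connect(":memory:")
ping(conn)      # healthy connection: returns silently

conn.close()    # simulate a connection that went away
try:
    ping(conn)
    stale = False
except StaleConnection:
    stale = True
```

The point of the pattern is that a dead connection is detected at checkout time, before the caller tries to run a real query on it.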

create_engine_from_conf(config)

need_connection_pool_fix = True

The variable is initially set to True. It becomes False under the following condition

if not config.has_section('database'):
    ......
    need_connection_pool_fix = False

So need_connection_pool_fix is determined by whether the config has a database section

Then look at the if branch

seafile_data_dir = os.environ['SEAFILE_CONF_DIR']
if seafile_data_dir:
    path = os.path.join(seafile_data_dir, 'seafile.db')
else:
    logging.warning('SEAFILE_CONF_DIR not set, can not load sqlite database.')
    return None
db_url = "sqlite:///%s" % path

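You can see that the code first reads seafile_data_dir from the SEAFILE_CONF_DIR environment variable. If the value is empty, it logs a warning and returns None. Note that os.environ['SEAFILE_CONF_DIR'] raises KeyError when the variable is not set at all, so the warning branch is only reached when the variable is set but empty. Otherwise the SQLite URL is built from the path of seafile.db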
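
To make the difference between an unset and an empty variable concrete, here is a minimal sketch, not seafobj's code. The function name sqlite_url_from_env and the directory /opt/seafile/conf are hypothetical; the function takes any mapping so that it can be tried without touching the real environment.

```python
import os


def sqlite_url_from_env(environ):
    """Build the SQLite URL, or return None if the directory is unknown.

    `environ` is any mapping; pass os.environ in real use.
    """
    # .get() returns None for a missing key instead of raising KeyError,
    # so "unset" and "empty" are handled the same way.
    data_dir = environ.get('SEAFILE_CONF_DIR')
    if not data_dir:
        return None
    return "sqlite:///%s" % os.path.join(data_dir, 'seafile.db')


url_unset = sqlite_url_from_env({})
url_empty = sqlite_url_from_env({'SEAFILE_CONF_DIR': ''})
url_set = sqlite_url_from_env({'SEAFILE_CONF_DIR': '/opt/seafile/conf'})
```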

Then look at the else branch

backend = config.get('database', 'type')

First, the database type is read from the config

if backend == 'mysql':
    ......
elif backend == 'oracle':
    ......
else:
    raise RuntimeError("Unknown database backend: %s" % backend)

These statements branch on the database type and reject any unknown type. Take a look at the statements under mysql

if config.has_option('database', 'host'):
    host = config.get('database', 'host').lower()
else:
    host = 'localhost'

if config.has_option('database', 'port'):
    port = config.getint('database', 'port')
else:
    port = 3306
username = config.get('database', 'user')
passwd = config.get('database', 'password')
dbname = config.get('database', 'db_name')
db_url = "mysql+pymysql://%s:%s@%s:%s/%s?charset=utf8" % (username, quote_plus(passwd), host, port, dbname)

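These statements read the host, port, user, password and database name from the config, falling back to localhost and 3306 when host or port is missing, and build the database URL. The password is escaped with quote_plus so that special characters do not break the URL.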
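
The effect of the defaults and of quote_plus can be seen in a small self-contained sketch. It follows the statements above, but the function name mysql_url_from_config and the sample configuration values are hypothetical.

```python
import configparser
from urllib.parse import quote_plus

# Hypothetical sample configuration: no host and no port are given.
SAMPLE_CONF = """
[database]
type = mysql
user = seafile
password = p@ss/word
db_name = seafile_db
"""


def mysql_url_from_config(config):
    """Build a MySQL URL, using defaults for the optional host and port."""
    if config.has_option('database', 'host'):
        host = config.get('database', 'host').lower()
    else:
        host = 'localhost'
    if config.has_option('database', 'port'):
        port = config.getint('database', 'port')
    else:
        port = 3306
    username = config.get('database', 'user')
    passwd = config.get('database', 'password')
    dbname = config.get('database', 'db_name')
    # quote_plus escapes '@' and '/' so they cannot be mistaken for URL syntax.
    return "mysql+pymysql://%s:%s@%s:%s/%s?charset=utf8" % (
        username, quote_plus(passwd), host, port, dbname)


config = configparser.ConfigParser()
config.read_string(SAMPLE_CONF)
db_url = mysql_url_from_config(config)
```

Without the escaping, the '@' inside the password would be read as the separator between the credentials and the host.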

The statements under oracle are nearly identical to those under mysql; the difference is the port.

Finally, the engine is created with pool recycling (pool_recycle=300 recycles connections that are older than 300 seconds), and ping_connection is registered on the pool's checkout event

kwargs = dict(pool_recycle=300, echo=False, echo_pool=False)
engine = create_engine(db_url, **kwargs)
if need_connection_pool_fix and not has_event_listener(Pool, 'checkout', ping_connection):
    # We use has_event_listener to double check in case we call create_engine
    # multiple times in the same process.
    add_event_listener(Pool, 'checkout', ping_connection)
return engine

init_db_session_class(config)

This function builds the Session class for the configured database according to the configuration file

try:
    engine = create_engine_from_conf(config)
except configparser.NoOptionError as xxx_todo_changeme:
    configparser.NoSectionError = xxx_todo_changeme
    raise RuntimeError("invalid seafile config.")

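This calls the create_engine_from_conf(config) function analyzed above. Note that the except clause looks like the residue of an automatic Python 2 to 3 conversion: as written, it catches only NoOptionError and then overwrites configparser.NoSectionError, whereas the intent was presumably to catch both exceptions.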
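
If the intent of the except clause in init_db_session_class is to handle both a missing option and a missing section, the usual Python 3 way to express that is a tuple of exception types. The following is a minimal sketch under that assumption; read_db_type is a hypothetical helper, not part of seafobj.

```python
import configparser


def read_db_type(config):
    """Return the database type, mapping config errors to RuntimeError."""
    try:
        return config.get('database', 'type')
    except (configparser.NoOptionError, configparser.NoSectionError):
        raise RuntimeError("invalid seafile config.")


def error_message(config):
    """Return the RuntimeError message, or None if the lookup succeeded."""
    try:
        read_db_type(config)
        return None
    except RuntimeError as e:
        return str(e)


no_section = configparser.ConfigParser()    # no [database] section at all

no_option = configparser.ConfigParser()
no_option.read_string("[database]\nuser = seafile\n")    # section without 'type'

msg_no_section = error_message(no_section)
msg_no_option = error_message(no_option)
```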

# reflect the tables
Base.prepare(engine, reflect=True)

Session = sessionmaker(bind=engine)
return Session

At this point, the db.py file has been analyzed. The purpose of this file is to build the database Session class according to the config

exceptions.py

This file declares a series of seafile-related exceptions

First, it declares the SeafObjException class, whose base class is Exception

class SeafObjException(Exception):
    def __init__(self, msg):
        Exception.__init__(self)
        self.msg = str(msg)

    def __str__(self):
        return self.msg

All the remaining classes derive from SeafObjException

class InvalidConfigError(SeafObjException):
    # Raised when an error happens during parsing (configuration parsing error)
    pass

class ObjectFormatError(SeafObjException):
    # Raised when an error happens while parsing an object
    pass

class GetObjectError(SeafObjException):
    # Raised when reading an object from the backend fails
    pass

class SwiftAuthenticateError(SeafObjException):
    # Raised when authentication for swift fails
    pass

class SeafCryptoException(SeafObjException):
    # Raised when a crypto related operation fails
    pass
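
A short sketch shows why a common base class is convenient: a caller can catch SeafObjException once and still handle every specific error. The base class below is copied from the listing above; the load function and the object id are hypothetical.

```python
class SeafObjException(Exception):
    def __init__(self, msg):
        Exception.__init__(self)
        self.msg = str(msg)

    def __str__(self):
        return self.msg


class GetObjectError(SeafObjException):
    pass


def load(obj_id):
    # Hypothetical loader that always fails, for demonstration only.
    raise GetObjectError('failed to read object %s' % obj_id)


try:
    load('abc123')
except SeafObjException as e:
    # Catching the base class also catches GetObjectError.
    caught = (type(e).__name__, str(e))
```

Because __str__ returns self.msg, the message passed to the constructor is what shows up in logs and tracebacks.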

objstore_factory.py

Because of the following import statement, let's look at the filesystem file first

from seafobj.backends.filesystem import SeafObjStoreFS

The filesystem file, in turn, has an import statement of its own

from .base import AbstractObjStore

We continue to trace back to the base file

base.py

You can see that the base file has only one class, AbstractObjStore

class AbstractObjStore(object):
    '''Base class of seafile object backend'''

As the docstring says, this class is the base class for seafile object backends

The methods include

__init__
read_obj
read_obj_raw
get_name
list_objs
obj_exists
write_obj
stat
stat_raw

Supplementary knowledge: NotImplementedError indicates that a method or function hasn't been implemented yet.

The following methods only raise this exception; their actual implementations are provided by the subclasses

read_obj_raw

Reads the original content of the object from the back end.

get_name

Gets the backend name for display in the log

list_objs

List all objects

obj_exists

Given repo_id and obj_id, check whether the object exists

write_obj

Write data to target backend

stat_raw
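
The pattern of a base class whose methods only raise NotImplementedError can be sketched as follows. This is a minimal illustration, not seafobj's code: AbstractStore and DictStore are hypothetical names, and only two of the methods are shown.

```python
class AbstractStore(object):
    """Hypothetical base class: declares the interface, implements nothing."""

    def get_name(self):
        raise NotImplementedError

    def obj_exists(self, repo_id, obj_id):
        raise NotImplementedError


class DictStore(AbstractStore):
    """Hypothetical subclass that keeps objects in a dictionary."""

    def __init__(self):
        self.objs = {}

    def get_name(self):
        return 'dict store'

    def obj_exists(self, repo_id, obj_id):
        return (repo_id, obj_id) in self.objs


store = DictStore()
store.objs[('repo1', 'obj1')] = b'data'

try:
    AbstractStore().get_name()
    base_raises = False
except NotImplementedError:
    base_raises = True
```

A subclass that forgets to override one of these methods fails loudly the first time the method is called, rather than silently doing nothing.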

read_obj

This method first calls read_obj_raw to get the raw data.

Then two if statements decrypt and decompress the data as needed

if self.crypto:
    data = self.crypto.dec_data(data)
if self.compressed and version == 1:
    data = zlib.decompress(data)

The whole method body is wrapped in a try/except block, and GetObjectError is raised on failure

stat

if self.crypto or self.compressed:
    try:
        data = self.read_obj(repo_id, verison, obj_id)
        return len(data)
    except:
        raise
return self.stat_raw(repo_id, obj_id)
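
A plausible reading of stat is that, for encrypted or compressed backends, the stored size differs from the size of the decoded object, so the object has to be read and decoded before its length is known. The sketch below illustrates this with a hypothetical in-memory store; MemoryObjStore, its write_obj behaviour and the error message are assumptions, and only read_obj and stat follow the snippets above.

```python
import zlib


class GetObjectError(Exception):
    """Stand-in for seafobj's GetObjectError."""


class MemoryObjStore(object):
    """Hypothetical in-memory backend used only for illustration."""

    def __init__(self, compressed, crypto=None):
        self.compressed = compressed
        self.crypto = crypto
        self.raw = {}

    def write_obj(self, data, repo_id, obj_id):
        # Assumption: a compressed backend stores zlib-compressed bytes.
        self.raw[(repo_id, obj_id)] = zlib.compress(data) if self.compressed else data

    def read_obj_raw(self, repo_id, version, obj_id):
        return self.raw[(repo_id, obj_id)]

    def stat_raw(self, repo_id, obj_id):
        return len(self.raw[(repo_id, obj_id)])

    def read_obj(self, repo_id, version, obj_id):
        try:
            data = self.read_obj_raw(repo_id, version, obj_id)
            if self.crypto:
                data = self.crypto.dec_data(data)
            if self.compressed and version == 1:
                data = zlib.decompress(data)
        except Exception as e:
            raise GetObjectError('failed to read %s/%s: %s' % (repo_id, obj_id, e))
        return data

    def stat(self, repo_id, version, obj_id):
        if self.crypto or self.compressed:
            # The stored size is not the object size, so decode first.
            return len(self.read_obj(repo_id, version, obj_id))
        return self.stat_raw(repo_id, obj_id)


store = MemoryObjStore(compressed=True)
store.write_obj(b'a' * 1000, 'repo', 'obj')
logical_size = store.stat('repo', 1, 'obj')    # size after decompression
raw_size = store.stat_raw('repo', 'obj')       # size as stored

try:
    store.read_obj('repo', 1, 'missing')
    missing_raises = False
except GetObjectError:
    missing_raises = True
```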

filesystem.py

The core code of the filesystem file is class SeafObjStoreFS(AbstractObjStore). This class inherits from the AbstractObjStore class in base.py, so it has to implement the methods that the base class leaves unimplemented

Now back to objstore_factory.py

Structure of the file

get_ceph_conf(cfg,section):
get_ceph_conf_from_json(cfg):
get_s3_conf(cfg,section):
get_s3_conf_from_json(cfg):
get_oss_conf(cfg,section):
get_swift_conf(cfg,section):
get_swift_conf_from_json(cfg):
class SeafileConfig(object)
class SeafObjStoreFactory(object):
get_repo_storage_id(repo_id):

Start with the SeafObjStoreFactory class

SeafObjStoreFactory

obj_section_map = {
    'blocks': 'block_backend',
    'fs': 'fs_object_backend',
    'commits': 'commit_object_backend',
}

Then look at the init method

def __init__(self, cfg=None):
    self.seafile_cfg = cfg or SeafileConfig()
    self.json_cfg = None
    self.enable_storage_classes = False
    self.obj_stores = {'commits': {}, 'fs': {}, 'blocks': {}}

If no cfg is passed in, a default SeafileConfig instance is created first. The constructor then continues

cfg = self.seafile_cfg.get_config_parser()
if cfg.has_option('storage', 'enable_storage_classes'):

Look at the get_config_parser method in SeafileConfig

def get_config_parser(self):
    if self.cfg is None:
        self.cfg = configparser.ConfigParser()
        try:
            self.cfg.read(self.seafile_conf)
        except Exception as e:
            raise InvalidConfigError(str(e))
    return self.cfg

You can see that a ConfigParser object is returned. It is created and loaded on the first call, and the same object is returned on later calls
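
The lazy loading in get_config_parser can be reproduced in a few lines. This is a minimal sketch with hypothetical names (LazyConfig, conf_path) and a temporary file instead of a real seafile.conf. It also shows one detail of the standard library worth knowing here: ConfigParser.read() silently skips files it cannot open, so a missing file produces an empty parser rather than an exception.

```python
import configparser
import os
import tempfile


class LazyConfig(object):
    """Hypothetical config holder that parses the file on first use."""

    def __init__(self, conf_path):
        self.conf_path = conf_path
        self.cfg = None

    def get_config_parser(self):
        if self.cfg is None:
            self.cfg = configparser.ConfigParser()
            self.cfg.read(self.conf_path)
        return self.cfg


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'seafile.conf')
    with open(path, 'w') as f:
        f.write("[storage]\nenable_storage_classes = true\n")

    conf = LazyConfig(path)
    first = conf.get_config_parser()     # parses the file
    second = conf.get_config_parser()    # returns the cached parser
    enabled = first.get('storage', 'enable_storage_classes').lower() == 'true'

    # A path that does not exist: read() ignores it without raising.
    missing = LazyConfig(os.path.join(tmp, 'nope.conf')).get_config_parser()
```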

enable_storage_classes = cfg.get('storage', 'enable_storage_classes')
if enable_storage_classes.lower() == 'true':

The option is read from the configuration and compared with 'true'. If storage classes are enabled, the following code runs

from seafobj.db import init_db_session_class
self.enable_storage_classes = True
self.session = init_db_session_class(cfg)
try:
    json_file = cfg.get('storage', 'storage_classes_file')
    f = open(json_file)
    self.json_cfg = json.load(f)
except Exception:
    logging.warning('Failed to load json file')
    raise

Next, look at the get_obj_stores() method

def get_obj_stores(self, obj_type):
    try:
        if self.obj_stores[obj_type]:
            return self.obj_stores[obj_type]
    except KeyError:
        raise RuntimeError('unknown obj_type ' + obj_type)

If the stores for obj_type have already been created, they are returned directly. If obj_type is not a known type, an error is raised
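
The lookup at the start of get_obj_stores acts as a cache check. The following minimal sketch shows the pattern in isolation; StoreCache, its _build method and the build_count counter are hypothetical and only stand in for the real store construction.

```python
class StoreCache(object):
    """Hypothetical cache of stores, keyed by object type."""

    def __init__(self):
        self.obj_stores = {'commits': {}, 'fs': {}, 'blocks': {}}
        self.build_count = 0

    def _build(self, obj_type):
        # Placeholder for the real work of creating backend objects.
        self.build_count += 1
        return {'default': 'store for %s' % obj_type}

    def get_obj_stores(self, obj_type):
        try:
            if self.obj_stores[obj_type]:
                return self.obj_stores[obj_type]    # already built
        except KeyError:
            raise RuntimeError('unknown obj_type ' + obj_type)
        self.obj_stores[obj_type] = self._build(obj_type)
        return self.obj_stores[obj_type]


cache = StoreCache()
stores_a = cache.get_obj_stores('fs')
stores_b = cache.get_obj_stores('fs')    # served from the cache

try:
    cache.get_obj_stores('unknown')
    unknown_raises = False
except RuntimeError:
    unknown_raises = True
```

An empty dictionary is falsy, so the same check distinguishes "known type, not built yet" from "known type, already built", while the KeyError covers unknown types.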

In the rest of the method, different types are handled differently.

The handling is similar in each case, so only the fs case is analyzed here.

obj_dir = os.path.join(bend[obj_type]['dir'], 'storage', obj_type)
self.obj_stores[obj_type][storage_id] = SeafObjStoreFS(compressed, obj_dir, crypto)

As you can see, a new SeafObjStoreFS object is created and stored in the obj_stores dictionary under its obj_type and storage_id.
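
How the nested obj_stores dictionary gets filled can be sketched as follows. Everything beyond the two lines quoted above is an assumption: the shape of the backend entries, the storage ids, the directories and the FakeFSStore class are hypothetical and only illustrate the indexing by obj_type and storage_id.

```python
import os


class FakeFSStore(object):
    """Hypothetical stand-in for SeafObjStoreFS; it only records its arguments."""

    def __init__(self, compressed, obj_dir, crypto=None):
        self.compressed = compressed
        self.obj_dir = obj_dir
        self.crypto = crypto


def build_fs_stores(backends, obj_type, compressed=False):
    """Create one store per backend entry, keyed by storage_id."""
    stores = {}
    for bend in backends:
        storage_id = bend['storage_id']
        obj_dir = os.path.join(bend[obj_type]['dir'], 'storage', obj_type)
        stores[storage_id] = FakeFSStore(compressed, obj_dir)
    return stores


# Hypothetical backend entries, loosely modelled on a JSON config file.
backends = [
    {'storage_id': 'hot', 'fs': {'dir': '/data/hot'}},
    {'storage_id': 'cold', 'fs': {'dir': '/data/cold'}},
]

obj_stores = {'commits': {}, 'fs': {}, 'blocks': {}}
obj_stores['fs'] = build_fs_stores(backends, 'fs')
```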

Then analyze the get_obj_store method

It returns an implementation of SeafileObjStore

Its code structure is similar to that of the method above

The other functions in the file serve the SeafObjStoreFactory class.

So far, we have analyzed the basic architecture of these three files, and will update this article after a deeper understanding of the project

Keywords: Database MySQL Oracle db

Added by Lucnet on Thu, 07 Oct 2021 21:41:26 +0300