In my previous blog post, I wrote an article about the design of database management platform. At that time, I did not mention one point in the design, that is, if my system should support more than N database types, and the table fields, case distinction and character length operation between each database on the market may be completely different
For example, in Oracle database, under utf8 coding, a Chinese character occupies 2 bytes, and 4000 length varchar2 can store 2000 Chinese characters, which is no problem. However, in domestic Dameng database, under utf8 coding, a Chinese character occupies 3 characters or 1 character after configuration. In this way, if data migration is not processed, errors will be reported
There is also the export and import function. If you simply use insert statements, there will certainly be no problem. Simple statements are universal, but millions of data will run a day with insert In oralce, there are loaders like dmpdp. In mysql and Damon databases, there are also different loaders for their respective databases
In this many to many situation, the code can easily become:
if("oralce".equals(dbtype)){ if("varchar2".equals(column)){ } }else if("dm".equals(dbtype)){ if("varchar2".equals(column)){ } }else if("mysql".equals(dbtype)){ if("varchar2".equals(column)){ } }else if("db2".equals(dbtype)){ if("varchar2".equals(column)){ } }else if("mongo".equals(dbtype)){ if("varchar2".equals(column)){ } }else if("redis".equals(dbtype)){ if("varchar2".equals(column)){ } }
As mentioned above, if there is more database type in the future, an else if needs to be added to this large if block. If the field comparison of the library is inconsistent, countless if in the inner layer may be added. If we think about it again, oracle can convert data to mysql and Damon database!!! Then it will explode. It needs to be nested again in the if to compare one by one. The one to many relationship needs to be rewritten one by one I believe many people have written this code, and an if block can be written in more than 5000 lines. If the new person takes over, he can basically prepare to write a resignation letter
So, the question is, how to optimize this structure? Is there any way?
Among the 23 common design modes of program design, one is called "strategy mode". We can see Baidu's explanation:
Strategy mode In policy mode( Strategy Pattern)The behavior of a class or its algorithm can be changed at run time. This type of design pattern belongs to behavioral pattern. In the policy pattern, we create objects that represent various policies and an object whose behavior changes with the change of policy objects context Object. Policy object change context Object. introduce Intent: define a series of algorithms,Package them one by one, And make them interchangeable. Main solution: when there are many similar algorithms, use if...else It is complex and difficult to maintain. When to use: a system has many classes, and what distinguishes them is their direct behavior. How to solve it: encapsulate these algorithms into classes one by one and replace them arbitrarily. Key code: implement the same interface. Application examples: 1. Zhuge Liang's knapsack, each knapsack is a strategy. 2. The way of travel, choose to ride a bike or take a car. Each way of travel is a strategy. 3, JAVA AWT Medium LayoutManager. Advantages: 1. The algorithm can be switched freely. 2. 2. Avoid using multiple conditional judgments. 3. Good scalability. Disadvantages: 1. The number of policy classes will increase. 2. All policy classes need to be exposed. Usage scenario: 1. If there are many classes in a system, and the only difference between them is their behavior, then using the policy pattern can dynamically make an object choose one behavior among many behaviors. 2. A system needs to dynamically choose one of several algorithms. 3. If an object has a lot of behaviors, without appropriate patterns, these behaviors have to be implemented by multiple conditional selection statements. Note: if a system has more than four policies, you need to consider using mixed mode to solve the problem of policy class expansion.
It looks like a lot and a little cloudy, but we can see that the strategy model mainly solves the problems of difficult maintenance and complexity caused by if else,
In the last line, baidu also has a sentence: if there are more than four strategies in a system, you need to consider using the mixed mode to solve the problem of strategy expansion!
Think about it. We can't have less than four database types Therefore, a simple strategy pattern may not be enough for us. Here I found another design pattern that can be used for matching: factory pattern
I believe you are familiar with the factory model There are too many places to use, so I won't introduce it. Let's see what benefits will be gained if the strategic model and factory model are integrated together?
If we simply use the policy pattern, we will produce a large number of policy classes. Every time we use it, we need new policy class to implement it, which can not achieve the elegant effect we imagined After combining the factory pattern, we generate the policy class by the factory. We only need to focus on implementing our internal processing logic
Come on, start coding (just write some simple solutions for yourself, don't spray them):
First, we create our own factory class for production strategy:
package com.zach.factory; import java.util.HashMap; import java.util.Map; /** * Function Description: * 〈Strategic factory * * @return : * @author : zach */ public class EtlDbFactory { public static final String MYSQL = "MYSQL"; public static final String ORACLE = "ORACLE"; public static final String VERTICA = "VERTICA"; public static final String DM = "DM"; public static final String SQLSERVER = "SQLSERVER"; public static final String SYBASE = "SYBASE"; public static final String DB2 = "DB2"; public static final String POSTGRESQL = "POSTGRESQL"; public static final String GBASE = "GBASE"; private static EtlDbFactory factory = new EtlDbFactory(); private EtlDbFactory(){} private static Map<String,EtlMakeDbEinvoice> etlMakeDbEinvoiceMap = new HashMap<>(); static { etlMakeDbEinvoiceMap.put(MYSQL,new MysqlEtlService()); etlMakeDbEinvoiceMap.put(ORACLE,new OracleEtlService()); etlMakeDbEinvoiceMap.put(VERTICA,new VerticaEtlService()); etlMakeDbEinvoiceMap.put(DM,new DmEtlService()); etlMakeDbEinvoiceMap.put(SQLSERVER,new SqlServerEtlService()); etlMakeDbEinvoiceMap.put(SYBASE,new SybaseEtlService()); etlMakeDbEinvoiceMap.put(DB2,new DB2EtlService()); etlMakeDbEinvoiceMap.put(POSTGRESQL,new PostgreSqlEtlService()); etlMakeDbEinvoiceMap.put(GBASE,new GbaseEtlService()); } public EtlMakeDbEinvoice creator(String type){ return etlMakeDbEinvoiceMap.get(type); } public static EtlDbFactory getInstance(){ return factory; } }
In this class, we introduce some common database types. In the data exchange business between databases, we usually encounter the situation of source database and target database. The factory class here is only for the encapsulation of source database
Then define our policy distribution interface:
package com.zach.factory; import java.util.Map; /** * Function Description: * 〈Policy factory mode distribution processing etl multiple dbType types * * @return : * @author : zach */ public interface EtlMakeDbEinvoice { //Data loading interface boolean makeDbLoading(Map<String,Object> paramMap); //Table creation statement loading interface boolean createDbTable(Map<String,Object> paramMap); }
Because this function only completes the collection of data from the source library for a while, and then loads various loaders to load different target libraries after data processing. Therefore, there are two temporary writing methods. Later, the demand can be increased.
When the above two classes are completed, our policy factory and distribution interface are completed. For the rest, we need to create different implementation classes according to different libraries to integrate the distribution interface, and then remove one ifelse for each database type
For example, we write an implementation of mysql as the source library:
package com.zach.factory; import com.zach.domain.ReturnMsgInfo; import com.zach.timeTask.etlLogsService; import org.slf4j.Logger; import org.slf4j.LoggerFactory; import org.springframework.stereotype.Component; import java.util.Map; @Component public class MysqlEtlService extends CommonLoadingService implements EtlMakeDbEinvoice { Logger logger = LoggerFactory.getLogger(MysqlEtlService.class); private ReturnMsgInfo returnMsgInfo; @Override public boolean makeDbLoading(Map<String, Object> paramMap) { InitData(paramMap); returnMsgInfo = super.switchDb(); //Start writing to the log according to the return value PubLogService(paramMap,returnMsgInfo); return true; } @Override public boolean createDbTable(Map<String, Object> paramMap) { InitData(paramMap); //Call the public table creation statement column data generation method returnMsgInfo = CommonCreateTableService.createTableStr(paramMap); if(0 == returnMsgInfo.getCode()){ returnMsgInfo = super.switchDb(); //Start writing to the log according to the return value PubLogService(paramMap,returnMsgInfo); }else{ paramMap.put("log",returnMsgInfo.getMsg()); etlLogsService.startTaskLogSuccess(paramMap);//Modify task table etlLogsService.taskItemLogAndPath(paramMap);//Log table + location where the load error log is recorded } return true; } }
So how do we call specific implementation methods?
We can call some of our predetermined configurations or query from the database. What type of library is our current source library? Then, the type is transferred to our factory class, and then distributed by our policy factory to the implementation class of each source library type, such as:
public static void insertExec(Map<String,Object> paramMap){ Map<String,Object> rtnMap = new HashMap<>(); //Policy factory processing source data type String SourceDbType = DictUtil.returnDbTypeDBTECHNOLOGYCLASSID(((etlDsaDbVO) paramMap.get("etlDsaDbVOSource")).getDbServerType());//Source table database type EtlMakeDbEinvoice etlMakeDbEinvoice = EtlDbFactory.getInstance().creator(SourceDbType); etlMakeDbEinvoice.makeDbLoading(paramMap); }
In the above code, I obtained the SourceDbType, that is, my source library type by querying the database, and then parsed it through the factory class EtlDbFactory. When we call makeDbLoading, it will be automatically distributed to our MysqlEtlService class for processing
In this way, we have solved the problem of ifelse in the first layer. What about the second layer? We should not forget that there are seven or eight source libraries. Correspondingly, our source library must also be compared with the target library in 6-7. The loaders of each target library are different. We do not have a general loader to realize different data types
Since we have used the form of policy factory to optimize the code, it is impossible to say that we write a large number of policies in the implementation class of each source library:
if("mysql".equals(Target library) && "mysql".equals(Source library)){ use mysql Loader }else if("Oralce".equals(Target library) && "mysql".equals(Source library)){ use Oracle Loader }else if(){ XXXXXX }else if(){ XXXXXX }else if(){ XXXXXX }else if(){ XXXXXX }
This is equivalent to not achieving our goal, but if we use a layer of policy factory to initialize the target library at this time? I think it's too complicated It's not worth the loss Therefore, I chose a compromise encapsulation scheme, because all loaders have a data format that can be defined. For example, the loader of Damon database requires one row of data, each column is separated by |, and multi row data uses @ @ to distinguish rows. In all loaders, this separator format can be customized by us, Therefore, we can completely encapsulate all data acquisition and only deal with the loading problem,
In this way, it is equivalent to writing each implementation class in the previous implementation classes of mysql,oralce, Damon and other databases, and compressing the query and loading for each database into a public method. When the loader rules are variable, only the public processing methods need to be implemented. If the target library does not support custom separators, you can rewrite them yourself, Save a lot of code,
For example:
@Override public boolean makeDbLoading(Map<String, Object> paramMap) { InitData(paramMap); returnMsgInfo = super.switchDb(); //Start writing to the log according to the return value PubLogService(paramMap,returnMsgInfo); return true; }
public void InitData(Map<String, Object> paramMap){ paramMapData = paramMap; etlList = (etlTaskConfigVO) paramMap.get("etlTaskConfigVO");//etl_starttasklog/etl_startconfig/etl_data_extract_config configuration information etlDsaDbVOSource = (etlDsaDbVO) paramMap.get("etlDsaDbVOSource");// etl_ data_ extract_ Config / ETL DSA info / ETL DB info source database information etlDsaDbVOTarget = (etlDsaDbVO) paramMap.get("etlDsaDbVOTarget");// etl_ data_ extract_ Config / ETL DSA info / ETL DB info target database information execSql = SQLUtil.trimEnterAndSpace(SQLUtil.removeCommnetFromSQL(etlList.getStartconfigExtractcode()));//Take out the comments, spaces, line breaks, leading spaces, etc. in the original sql TargetDbType = DictUtil.returnDbTypeDBTECHNOLOGYCLASSID(((etlDsaDbVO) paramMap.get("etlDsaDbVOTarget")).getDbTechnologyClassId());//Target table database type TargetTableName = etlList.getStartconfigTargettablename().toUpperCase();//Uppercase target table name fileExtractDirPath = ETLGlobal.getfileExtractDirPath() + etlList.getTasklogCode();//Initial log / data file storage path fileExtractBakDir = ETLGlobal.getFileExtractBakDir() + etlList.getTasklogCode();//Error log backup directory storage path FileUtils.createFolder(fileExtractDirPath);//Create data file storage path } public ReturnMsgInfo switchDb(){ switch (TargetDbType.toUpperCase()){ case "MYSQL": returnMsgInfo = MYSQL(); break; case "ORACLE": returnMsgInfo = ORACLE(); break; case "DM": returnMsgInfo = DM(); break; default: logger.error("The target database type cannot be matched. Please check the type: " + TargetTableName.toUpperCase()); returnMsgInfo.error(); break; } return returnMsgInfo; }
/** * Function Description: * 〈Processing when the target database is mysql * * @return : com.zach.domain.ReturnMsgInfo * @author : zach */ public ReturnMsgInfo MYSQL(){ logger.info("The entry target data is mysql situation"); returnMsgInfo = new ReturnMsgInfo(); returnMsgInfo.success("Task execution completed,Executive sql Statement is:" + execSql); try(Connection conn = DBConnection.getConnection(etlDsaDbVOSource.getDbDriver(),etlDsaDbVOSource.getDbUrl(),etlDsaDbVOSource.getDsaInfoUsername(),etlDsaDbVOSource.getDsaInfoPassword()); Statement stat = conn.createStatement(); ResultSet rs = stat.executeQuery(execSql)) { Map<String,Object> rtnMap = etlExecMYSQLToolService.createDataFile(TargetTableName,rs,fileExtractDirPath);//Generate data file if("0".equals(rtnMap.get("code").toString())){ String dataFilePath = fileExtractDirPath + File.separator + TargetTableName + ".txt"; String batStr = "LOAD DATA LOCAL INFILE '" + dataFilePath + "' INTO TABLE " + TargetTableName + " FIELDS TERMINATED BY ','OPTIONALLY ENCLOSED BY '\"'LINES TERMINATED BY '\r\n'".replaceAll("\\\\","/"); //Execute command statements (mysql LOADDATA statements are executed through mysql instruction statements, not bat or sh command statements and components) stat.executeQuery(batStr); }else{ //Data file generation failed, recording error returnMsgInfo.error("Data file generation failed"); } return returnMsgInfo; }catch (Exception e){ e.printStackTrace(); returnMsgInfo.error("Task execution failed,Executive sql Statement is:" + execSql + "The reasons for the failure are:" + e.getMessage()); return returnMsgInfo; } }
This minimizes code repeatability
Finally, I feel that the use of design patterns can bring magical changes and upgrades to our code and design in some cases, but the introduction of design patterns must be introduced when we determine the requirements and do not introduce them when they will have a disgusting impact on code readability or performance. They will be very easy to use and need to be reviewed
I can't say that I think this place may use design patterns. Without saying a word, I directly go to the design patterns and look at the high-end, but it may do the opposite, greatly reducing the code readability