Phoenix duplicate record -- the reason and solution of duplicate query data

Problem description

issue A: after turning on the parameter (phoenix.stats.enabled=true), use Phoenix SQL to query the data, and there are duplicates (the data found is more than the actual content stored in HBase) issue B: after the parameter is closed (phoenix.stats.enabled=false), Phoenix SQL performance decreases.

Environmental Science

Phoenix version: phoenix-4.8.0-HBase-1.1

Purpose of this paper

Explore the influence of stats on query

Parameter description

phoenix.stats.enabled: whether statistics is enabled (the default is true).

Parameter function

When stats is turned on, major action and region split will automatically call the updateStatistic method of statisticscolollector, collect the key information of Region, calculate the guideposts and write them to the system.stats table.

Parameter influence (parallelism)

Phoenix SQL improves performance by dividing queries into more scan and executing scan in parallel. The data between guide posts will be treated as a chunk, each chunk corresponds to an scan, and the query performance can be improved by executing scan in parallel. The chunk size can be configured through phoenix.stats.guidepost.width. Smaller chunks mean more scan & more concurrency, and more chunks need to be merged by clients.

Guide posts related SQL

Set guide? Posts? Width

ALTER TABLE my_table SET GUIDE_POSTS_WIDTH = 10000000 ALTER TABLE my_table SET GUIDE_POSTS_WIDTH = null

Recalculate guideposts

UPDATE STATISTICS my_table

View guideposts

select * from system.stats where physical_name='my_table' ;

Performance improvement of guidepost

As we can see from the above, guidepos can divide the data of region into smaller blocks to generate more scan. This behavior can be observed through explain SQL.

Do not use guidepost

Clear guideposts: delete from system. Stats where physical menu name ='db. Table '; Execute explain: explain select * from XXX where XXX > 'xxx'; Return: CLIENT 2-CHUNK PARALLEL 2-WAY ROUND ROBIN RANGE SCAN OVER

Using guidepost

Generate guideposts: update statistics DB.TABLE all; View the generated guideposts: select * from system.stats where physical_name='DB.TABLE '; View execution plan: explain select * from XXX where XXX > 'xxx'; Return: client 10-chunk XXX rows XXX bytes parallel 2-way round range scan over

It can be found that with guideposts enabled, more scan s are actually generated.

dive deep into code

Where to come from and where to go
Tracking link

DefaultStatisticsCollector -> updateStatistic -> commitStats -> StatisticsWriter -> addStats -> addGuidepost -> addGuidepost

        // tableName = SYSTEM_STATS_NAME(system.stats)
        byte[] prefix = StatisticsUtil.getRowKey(tableName, cfKey, ptr);
        Put put = new Put(prefix);
        put.add(QueryConstants.DEFAULT_COLUMN_FAMILY_BYTES, PhoenixDatabaseMetaData.GUIDE_POSTS_WIDTH_BYTES,
                timeStamp, PLong.INSTANCE.toBytes(byteCount));
        put.add(QueryConstants.DEFAULT_COLUMN_FAMILY_BYTES,
                PhoenixDatabaseMetaData.GUIDE_POSTS_ROW_COUNT_BYTES, timeStamp,
                PLong.INSTANCE.toBytes(rowCount));
        // Add our empty column value so queries behave correctly
        put.add(QueryConstants.DEFAULT_COLUMN_FAMILY_BYTES, QueryConstants.EMPTY_COLUMN_BYTES, timeStamp,
                ByteUtil.EMPTY_BYTE_ARRAY);
        mutations.add(put);
Why can more scan be generated
Tracking link

PhoenixStatement.executeQuery -> BaseQueryPlain.iterator -> ScanPlan.newIterator -> ParallelIterators -> BaseResultIterators.getParallelScans

        int gpsSize = gps.getGuidePostsCount();
        int estGuidepostsPerRegion = gpsSize == 0 ? 1 : gpsSize / regionLocations.size() + 1;
        int keyOffset = 0;
        ImmutableBytesWritable currentGuidePost = ByteUtil.EMPTY_IMMUTABLE_BYTE_ARRAY;
        List<Scan> scans = Lists.newArrayListWithExpectedSize(estGuidepostsPerRegion);

In the code, we need to pay attention to regionLocations instead of regions. That is, the number of nodes in the region is counted, not the number of regions. From the above code, you can see that guidepost helps Phoenix generate more scan.

Why duplicate records

Explanation: duplicate records -- SQL queries out "more" data, which does not exist in HBase.

This is a hidden logic. First of all, from the conclusion, if the region at the end of some tables is smaller than guide.post.width (the default is 104857600, or 100M), then the guide post of this region will not be generated. When querying, the parallel scan is scanned from the last guidepost (not the region's startkey), resulting in duplicate scan generation for the region.

    while (regionIndex <= stopIndex) {
		......
		try {
			while (guideIndex < gpsSize && (endKey.length == 0 || currentGuidePost.compareTo(endKey) <= 0)) {
				Scan newScan = scanRanges.intersectScan(scan, currentKeyBytes, currentGuidePostBytes, keyOffset,
						false);
				......
				scans = addNewScan(parallelScans, scans, newScan, currentGuidePostBytes, false, regionLocation);
				currentKeyBytes = currentGuidePostBytes;
				currentGuidePost = PrefixByteCodec.decode(decoder, input);
				currentGuidePostBytes = currentGuidePost.copyBytes();
				guideIndex++;
			}
	    } catch (EOFException e) {}
		Scan newScan = scanRanges.intersectScan(scan, currentKeyBytes, endKey, keyOffset, true);
		if(newScan != null) {
			ScanUtil.setLocalIndexAttributes(newScan, keyOffset, regionInfo.getStartKey(),
				regionInfo.getEndKey(), newScan.getStartRow(), newScan.getStopRow());
		}
		scans = addNewScan(parallelScans, scans, newScan, endKey, true, regionLocation);
		currentKeyBytes = endKey;
		regionIndex++;
	}

When traversing to the last region, if there is no guideposts in the region, scan may start from the previous guideposts, causing the previous scan and this scan to scan repeatedly.

Judge whether the guidepost is written
if (byteCount >= guidepostDepth) {
                ImmutableBytesWritable row = new ImmutableBytesWritable(kv.getRowArray(), kv.getRowOffset(), kv.getRowLength());
                if (gps.getSecond().addGuidePosts(row, byteCount, gps.getSecond().getRowCount())) {
                    gps.setFirst(0l);
                    gps.getSecond().resetRowCount();
                }
            }

guidepostDepth calculation

    public static long getGuidePostDepth(int guidepostPerRegion, long guidepostWidth, HTableDescriptor tableDesc) {
        if (guidepostPerRegion > 0) {
            long maxFileSize = HConstants.DEFAULT_MAX_FILE_SIZE;
            if (tableDesc != null) {
                long tableMaxFileSize = tableDesc.getMaxFileSize();
                if (tableMaxFileSize >= 0) {
                    maxFileSize = tableMaxFileSize;
                }
            }
            return maxFileSize / guidepostPerRegion;
        } else {
            return guidepostWidth;
        }
    }

duplicate records repair

Fix in version 4.12: https://issues.apache.org/jira/browse/PHOENIX-4007

Solution Upgrade Phoenix to 4.12

Reference document

phoenix-statistics [1]: https://phoenix.apache.org/update_statistics.html [2]: https://github.com/apache/phoenix/blob/4.14-HBase-1.1/phoenix-core/src/main/java/org/apache/phoenix/schema/stats/DefaultStatisticsCollector.java [3]: https://github.com/apache/phoenix/blob/4.14-HBase-1.1/phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsWriter.java [4]:https://github.com/apache/phoenix/blob/4.14-HBase-1.1/phoenix-core/src/main/java/org/apache/phoenix/iterate/BaseResultIterators.java

Keywords: Big Data Apache SQL HBase Java

Added by evan18h on Fri, 06 Dec 2019 21:55:32 +0200