0 votes
2 views
in Python by (16.4k points)
closed by

I used influxdb-python to insert a large amount of data read from a Redis stream. The stream is created with maxlen=600 and a new entry is appended roughly every 100 ms, and I need to keep all of that data, so I read the stream and move the entries into InfluxDB (I don't know whether there is a better database for this). However, with batch inserts only ⌈count/batch_size⌉ points remain, one at the end of each batch; everything else appears to have been overwritten.

Here is the code:

import redis
from apscheduler.schedulers.blocking import BlockingScheduler
import time
import datetime
import os
import struct
from influxdb import InfluxDBClient

def parse(datas):
    # datas is one stream entry: (entry ID, field dict)
    ts, data = datas
    w_json = {
        "measurement": 'sensor1',
        "fields": {
            "Value": data[b'Value'].decode('utf-8'),
            "Count": data[b'Count'].decode('utf-8')
        }
    }
    return w_json

def archived_data(rs, client):
    # read up to 600 new entries from the stream as consumer 'test' of 'group1'
    results = rs.xreadgroup('group1', 'test', {'test1': ">"}, count=600)
    if len(results) != 0:
        print("len(results[0][1]) = ", len(results[0][1]))
        datas = list(map(parse, results[0][1]))
        client.write_points(datas, batch_size=300)
        print('insert success')
    else:
        print("No new data is generated")

if __name__ == "__main__":
    try:
        rs = redis.Redis(host="localhost", port=6379, db=0)
        # recreate the consumer group so the whole stream is re-read on start-up
        rs.xgroup_destroy("test1", "group1")
        rs.xgroup_create('test1', 'group1', '0-0')
    except Exception as e:
        print("error = ", e)
    try:
        client = InfluxDBClient(host="localhost", port=8086, database='test')
    except Exception as e:
        print("error = ", e)
    try:
        # archive the stream contents every 60 seconds
        sched = BlockingScheduler()
        sched.add_job(archived_data, 'interval', seconds=60, args=[rs, client])
        sched.start()
    except Exception as e:
        print(e)

The data in InfluxDB changes as follows:

> select count(*) from sensor1;
name: sensor1
time count_Count count_Value
---- ----------- -----------
0    6           6

> select count(*) from sensor1;
name: sensor1
time count_Count count_Value
---- ----------- -----------
0    8           8

> select Count from sensor1;
name: sensor1
time                Count
----                -----
1594099736722564482 00000310
1594099737463373188 00000610
1594099795941527728 00000910
1594099796752396784 00001193
1594099854366369551 00001493
1594099855120826270 00001777
1594099913596094653 00002077
1594099914196135122 00002361

Why does the data appear to be overwritten, and how can I fix this so that all of the data is inserted?

I would appreciate any help in solving it.

closed

4 Answers

0 votes
by (15.4k points)
selected by
 
Best answer
You are encountering an issue where data seems to be overwritten when inserting it into InfluxDB using batch inserts. This occurs while reading data from a Redis-Stream and attempting to move it to InfluxDB for long-term storage. You have observed that only a portion of the data is successfully inserted, with the rest appearing to be overwritten.

To address this issue and ensure that all the data is inserted into InfluxDB without being overwritten, you can consider the following suggestions:

Validate Redis-Stream data: Double-check the Redis-Stream data you are reading to ensure that it doesn't contain duplicates or repeated entries. Verifying the uniqueness of the data will help prevent unintentional overwriting during insertion.

Adjust batch size: Experiment with different batch sizes when using the write_points function. You currently have a batch size of 300, but it may not be optimal for your specific data. Try increasing the batch size or even removing it to insert all the data at once, ensuring that the batch size aligns with your system's capabilities.

Handle data parsing errors: Implement error handling mechanisms to gracefully handle any errors that may occur during data parsing or decoding. This will prevent exceptions from disrupting the data insertion process and help ensure the successful transfer of all data.

Verify data retention policy: Verify the retention policy settings in your InfluxDB configuration. It is possible that the overwritten data is a result of retention policy rules. Adjust the retention policy if necessary to retain the desired data for your purposes.

By taking these steps and carefully reviewing your code, adjusting the batch size, addressing any data parsing errors, and verifying the retention policy, you should be able to overcome the issue of data being overwritten during insertion into InfluxDB.
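
A rough sketch of the batch-size and error-handling suggestions above (safe_parse is an illustrative name, not from the original post; it assumes the parse function and results variable from the question):

def safe_parse(entry):
    # skip entries whose fields are missing or cannot be decoded
    try:
        return parse(entry)
    except (KeyError, UnicodeDecodeError) as e:
        print("skipping malformed entry:", e)
        return None

points = [p for p in map(safe_parse, results[0][1]) if p is not None]
if points:
    # batch_size=None sends everything in a single request; a value such as 300
    # splits the write into several smaller HTTP requests
    client.write_points(points, batch_size=None)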
0 votes
by (26.4k points)

In InfluxDB, the combination of timestamp + tags must be unique (i.e. two data points with the same tag values and the same timestamp cannot coexist). Unlike SQL databases, InfluxDB does not throw a unique-constraint violation; it silently overwrites the existing point with the incoming one. Your points have no tags and no explicit time field, so the server assigns the timestamps itself; points written together can end up sharing a timestamp and therefore overwrite each other, which matches only a handful of points surviving per batch.
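
Building on that, here is a minimal sketch of one possible fix (not the original poster's code, and assuming the stream entries carry the default millisecond-sequence IDs such as b'1594099736722-0'): derive each point's time field from the Redis stream entry ID, so no two points can share a timestamp.

def parse(entry):
    entry_id, data = entry  # e.g. (b'1594099736722-0', {b'Value': ..., b'Count': ...})
    ms, seq = entry_id.decode('utf-8').split('-')
    return {
        "measurement": "sensor1",
        # the millisecond part of the stream ID becomes the point's timestamp;
        # the sequence number is kept as a tag in case two entries share a millisecond
        "time": int(ms),
        "tags": {"seq": seq},
        "fields": {
            "Value": data[b'Value'].decode('utf-8'),
            "Count": data[b'Count'].decode('utf-8'),
        },
    }

# write with millisecond precision so the integer time values above are interpreted correctly
client.write_points(list(map(parse, results[0][1])), time_precision='ms', batch_size=300)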

0 votes
by (25.7k points)
The issue you are facing with overwriting data in InfluxDB while using batch inserts could be due to the way you are handling the Redis-Stream data and the batch size in your code. Here are a few suggestions to resolve the issue:

Check the Redis-Stream data: Make sure that the Redis-Stream data you are reading is unique and not duplicating the previously read data. You can add some logging or print statements to verify the data being processed.

Adjust batch size: Experiment with different batch sizes in the write_points function. You have set the batch size to 300, but it may not be suitable for your specific data and use case. Try increasing the batch size or even removing it altogether to insert all the data at once. Keep in mind the limitations of InfluxDB and the available system resources.

Handle data parsing errors: Add error handling mechanisms to handle any potential errors that may occur during data parsing or decoding. This will ensure that any exceptions do not interrupt the data insertion process.

Verify data retention policy: Check the retention policy settings in your InfluxDB configuration. It is possible that data is being overwritten due to retention policy rules. Adjust the retention policy if necessary to retain all the data you want.

By reviewing these aspects of your code and adjusting the batch size and data handling, you should be able to resolve the issue of overwritten data in InfluxDB.
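
For the retention-policy check, one quick way to inspect it is with the same influxdb-python client (the database name 'test' comes from the question; adjust it to your setup):

# list the retention policies on the 'test' database; a duration of '0s' means
# points are kept forever, anything shorter will expire old points
for rp in client.get_list_retention_policies('test'):
    print(rp['name'], rp['duration'], rp['default'])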
0 votes
by (19k points)
You're facing an issue where data appears to be overwritten during insertion into InfluxDB using batch inserts. You're reading data from a Redis-Stream and transferring it to InfluxDB, but only a portion of the data is successfully inserted while the rest is overwritten. To resolve this, validate the Redis-Stream data, adjust the batch size, handle data parsing errors, and verify the retention policy in InfluxDB.
