in Python by (8.1k points)

I am building a data acquisition device that retrieves sensor data from an API every 5 minutes and saves it in CSV files, which are exported to a database every 24 hours. I would like to reduce the size of these files by saving a reading only when its value changes.

My idea is to save all the data in a "memory" CSV file (deleted at the end of the day), compare its last X lines (df1, taken at time T1) with the new dataframe (df2, taken at time T2), and create a third dataframe (df3, also at time T2) that omits the rows whose values have not changed. This df3 is then written to a second CSV file, which is exported to the database at the end of the day.

Is this the right way to proceed?

How can I compare two dataframes of the same size and create a third dataframe without the rows where the value does not change?

df1

   Time   Name  Value
0   t1  Name1      3
1   t1  Name2      1
2   t1  Name3      5
3   t1  Name4      9

df2

   Time   Name  Value
0   t2  Name1      3
1   t2  Name2      7
2   t2  Name3      5
3   t2  Name4      2

df3

   Time   Name  Value
0   t2  Name2      7
1   t2  Name4      2

1 Answer

by (15.8k points)

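Since df1 and df2 have the same size and list the same names in the same order, you can build the third dataframe by keeping only the rows of df2 whose value differs from the corresponding row of df1.

You can do it like this: df3 = df2[df2['Value'] != df1['Value']]

The result keeps the original row labels of df2 (1 and 3 in your example); call df3.reset_index(drop=True) if you want them renumbered from 0.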
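
For reference, here is a minimal runnable sketch of that comparison using the sample data from the question. It assumes the column is named 'Value' as in the printed dataframes. The second variant, which matches rows by 'Name' instead of by position, is an addition of mine for the case where the two dataframes might not list the sensors in the same order; the names previous, changed and df3_by_name are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "Time": ["t1"] * 4,
    "Name": ["Name1", "Name2", "Name3", "Name4"],
    "Value": [3, 1, 5, 9],
})
df2 = pd.DataFrame({
    "Time": ["t2"] * 4,
    "Name": ["Name1", "Name2", "Name3", "Name4"],
    "Value": [3, 7, 5, 2],
})

# Positional comparison: valid only when both frames share the same index
# and list the sensors in the same order.
df3 = df2[df2["Value"] != df1["Value"]].reset_index(drop=True)

# Comparison aligned on the sensor name (assumes names are unique in df1).
# A name that is missing from df1 maps to NaN, so its row is kept as a change.
previous = df1.set_index("Name")["Value"]
changed = df2["Name"].map(previous) != df2["Value"]
df3_by_name = df2[changed].reset_index(drop=True)

print(df3)
```

With the sample data, both variants keep only the rows for Name2 and Name4.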

...