The statistical software Stata allows short text snippets to be saved within a dataset, using notes, characteristics, or both.
I find this feature very valuable because it lets me store a variety of information alongside the data, from reminders and to-do lists to details of how I generated the data or which estimation method I used for a particular variable.
I am now trying to reproduce this functionality in Python 3.6. I have searched online and read several posts, but none of them addresses exactly what I want to do.
For a small NumPy array, I have concluded that combining the function numpy.savez() with a dictionary can adequately store all relevant information in a single file.
For example:
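import numpy as np

a = np.array([[2, 4], [6, 8], [10, 12]])
d = {"first": 1, "second": "two", "third": 3}
np.savez("whatever_name.npz", a=a, d=d)  # the dict is pickled into a 0-d object array
data = np.load("whatever_name.npz", allow_pickle=True)  # newer NumPy versions need this to read object arrays
arr = data['a']
dic = data['d'].item()  # unwrap the 0-d object array to get the dict back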
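A variant of the same idea that avoids pickling might look like the sketch below. It assumes the notes are JSON-serializable (strings, numbers, lists, nested dicts); the helper names `save_with_notes` and `load_with_notes` and the key `notes` are made up for illustration, and the file is written to a temporary directory only to keep the example self-contained.

```python
import json
import os
import tempfile

import numpy as np


def save_with_notes(path, arr, notes):
    """Store an array plus a JSON-encoded dict of notes in one .npz file."""
    # json.dumps returns a plain string, which NumPy saves as a 0-d
    # unicode array, so no object array (and no pickle) is involved.
    np.savez(path, a=arr, notes=json.dumps(notes))


def load_with_notes(path):
    """Return (array, notes dict) from a file written by save_with_notes."""
    # allow_pickle=True is not needed: the file holds only numeric and
    # string arrays.
    with np.load(path) as data:
        arr = data["a"]
        notes = json.loads(data["notes"].item())
    return arr, notes


a = np.array([[2, 4], [6, 8], [10, 12]])
d = {"first": 1, "second": "two", "third": 3}

# Hypothetical location: a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "whatever_name.npz")
save_with_notes(path, a, d)
arr, dic = load_with_notes(path)
```

The trade-off is that anything JSON cannot represent (tuples, sets, arbitrary Python objects) would have to be converted before saving, whereas the pickled dictionary keeps such values as they are.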
However, the question remains:
Are there better ways to store additional pieces of information in a file containing a NumPy array or a (large) Pandas DataFrame?
I am particularly interested in the pros and cons of any suggestions you may have, ideally with examples. The fewer dependencies, the better.