Posts

Showing posts from October, 2023

Tried to open a very big CSV file in Excel

Have you ever tried opening a very big CSV file with Excel? I tried to open a CSV file with 5 million rows of data using Microsoft Excel. Here is what I ran into:

1,048,576 row limit. I was aware of this roughly 1 million row limit. Not all of the data was loaded, and Excel even showed a notification dialog saying so.

32,767 characters per cell. This one I was not aware of. After opening a file that exceeded this limit, extra rows were displayed and the sheet looked like a mess, even though the same file was properly formatted when opened with Notepad.

Slow formulas and filters. This one is obvious, and expected given the size of the data.

Why did I need to open a file with over 5 million rows in the first place? I was actively trying to learn machine learning and was building a dataset for supervised learning. I wanted to open the file to mark classes and values for training classification and regression models.

The workarounds I used include: filtered and removed currently unused rows. This cut the size by almost half. split the files into smal
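The file-splitting workaround could be scripted roughly as below. This is only a minimal sketch, not the script actually used for the post: the function name split_csv, the sample data and the tiny chunk size are all made up for illustration, and it relies only on Python's standard csv module.

```python
import csv
import io

EXCEL_MAX_ROWS = 1_048_576  # Excel's per-sheet row limit (the header counts as a row)


def split_csv(source, rows_per_part):
    """Yield lists of rows; each list starts with the header and holds
    at most rows_per_part data rows. (Hypothetical helper.)"""
    reader = csv.reader(source)
    header = next(reader, None)
    if header is None:  # empty input: nothing to split
        return
    part = [header]
    for row in reader:
        part.append(row)
        if len(part) - 1 == rows_per_part:
            yield part
            part = [header]
    if len(part) > 1:  # leftover rows that did not fill a whole part
        yield part


# Tiny in-memory demo. For a real file, pass open(path, newline="") instead
# and use rows_per_part=EXCEL_MAX_ROWS - 1 to leave room for the header row.
sample = io.StringIO("id,label\n1,a\n2,b\n3,c\n4,d\n5,e\n")
parts = list(split_csv(sample, rows_per_part=2))
for i, part in enumerate(parts, start=1):
    print(f"part {i}: {len(part) - 1} data rows")
```

Each part repeats the header, so every piece can be opened in Excel on its own and written back out with csv.writer once the labels are marked.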

Python literal_eval - convert stringed list into python list

Python's ast.literal_eval() is a very useful function that evaluates the contents of a string as a Python literal. If the string contains a valid literal (a list, dict, tuple, number, and so on), the corresponding Python object is created. One use case where I personally used it was converting string values stored in CSV files. The rows contained lists stored as strings, and I wanted to run operations on them. They must be converted into real lists first; ast.literal_eval() achieves that!

#python code
import ast

#string containing a list
stringed_list = "[1,2,3]"

#converting stringed list into python list
converted_list = ast.literal_eval(stringed_list)

print(type(stringed_list)) # <class 'str'>
print(type(converted_list)) # <class 'list'>

Even though this is a really convenient way to convert a stringed list back into a Python list, it is slow. It works great for moderately small CSV data files in which we store scaled parameter lists, etc. as strings. When size of the file increases so does
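To tie this back to the CSV use case, here is a minimal sketch of converting a whole column of stringed lists. The column names ("name", "params") and the sample rows are hypothetical, and the data is read from an in-memory string so the example runs without a file.

```python
import ast
import csv
import io

# Hypothetical CSV content: the "params" column holds lists saved as strings.
raw = io.StringIO(
    'name,params\n'
    'run_a,"[0.1, 0.5, 0.9]"\n'
    'run_b,"[1, 2, 3]"\n'
)

rows = []
for record in csv.DictReader(raw):
    # literal_eval only accepts Python literals; it raises ValueError or
    # SyntaxError for anything else instead of executing it.
    record["params"] = ast.literal_eval(record["params"])
    rows.append(record)

print(rows[0]["params"], type(rows[0]["params"]).__name__)
print(sum(rows[1]["params"]))
```

After the conversion each "params" value is a real list, so operations such as sum() or indexing work on it directly.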
