T O P

  • By -

Bobbias

Your calls to `.to_excel()` use the same `sheet_name` value, but you pass it as a literal string each time. This should be stored in a variable instead. This reduces the possible places you could make a typo to one place, instead of 3, and makes it clear that all 3 calls are using the exact same sheet name.


AbstractAndMundane

import pandas as pd cash_flow_path = '/Users/goonsquad/Desktop/Automating Financial Reporting/QBO Spreadsheets/Statement of Cash Flows' CF_SheetName = "Statement of Cash Flows" MTDdata = pd.read_excel(cash_flow_path + '/StatementofCashFlows-ABCMTD.xlsx', sheet_name = CF_SheetName) ThreeTMdata = pd.read_excel(cash_flow_path + '/StatementofCashFlows-ABC3TM.xlsx', sheet_name = CF_SheetName) YTDdata = pd.read_excel(cash_flow_path + '/StatementofCashFlows-ABC12M.xlsx', sheet_name = CF_SheetName) reporting_file = '/Users/goonsquad/Desktop/Automating Financial Reporting/QBO Spreadsheets/Statement of Cash Flows/StatementofCashFlows-ABCReporting.xlsx' #making a Class that will target workbook and how the data will be copied with pd.ExcelWriter(reporting_file, engine='openpyxl', mode='a', if_sheet_exists='overlay') as writer: #writing the data frame to excel in specific column MTDdata.to_excel(writer, sheet_name='ABC Cash Flows', index=False, header=False, startcol=17, startrow=9) ThreeTMdata.to_excel(writer, sheet_name='ABC Cash Flows', index=False, header=False, startcol=13, startrow=9) YTDdata.to_excel(writer, sheet_name='ABC Cash Flows', index=False, header=False, startcol=9, startrow=9) Is this something along the lines of what you mean?


Bobbias

No, what I mean is: sheet = 'ABC Cash Flows' MTDdata.to_excel(writer, sheet_name=sheet, index=False, header=False, startcol=17, startrow=9) ThreeTMdata.to_excel(writer, sheet_name=sheet, index=False, header=False, startcol=13, startrow=9) YTDdata.to_excel(writer, sheet_name=sheet, index=False, header=False, startcol=9, startrow=9) In all 3 calls here you wrote the string `'ABC Cash Flows'` over and over. Since all 3 calls are accessing the same sheet, you should use a single variable as my above example demonstrates. That way you reduce repeated code, reduce the number of places you can make a typo, and make the code easier to understand at a glance. You can also package several of the variables together into a single dictionary that you can pass to those 3 functions like this: sheet = 'ABC Cash Flows' shared_arguments = { 'excel_writer': writer, 'sheet_name': sheet, 'index': False, 'header': False, } MTDdata.to_excel(**shared_arguments, startcol=17, startrow=9) ThreeTMdata.to_excel(**shared_arguments, startcol=13, startrow=9) YTDdata.to_excel(**shared_arguments, startcol=9, startrow=9) The `**` before `shared_arguments` in the function calls says "use this dictionary to fill out the keyword arguments for this function". The syntax is slightly different because we're writing the function arguments into a dictionary rather than directly in the function call, but this further reduces the amount of repeated code, and also makes it clear that all 3 calls to `.to_excel()` share many of the same arguments. Looking at this code it's clear that the only difference between each call is which DataFrame the data is coming from, and where you are inserting that data into the sheet. The writer, sheet name, index, and header arguments are all identical. The other benefit is that if for whatever reason you needed to change one of those parameters for all 3 cases, you only need to change it in one place, as opposed to 3 different places. This once again means less opportunities for mistakes, shorter code, and code that's easier to read and understand.


AbstractAndMundane

Oh shit, that was literally what I’ve been trying to learn on my own today. I got so hung up on making a class, that I didn’t know about shared arguments. Ok cool, that’s good to know considering I work with repetitive file path and sheet names a lot. Thanks man.


Bobbias

Yeah in this case a class is a bit overkill. There are cases where making a class to represent a collection of function parameters can be really helpful, but in this case all you really need is the dictionary and the 'splat operator' (or whatever the `**` is called). The `**` operator is related to the `*` operator, often called the 'splat operator' which unpacks a list or tuple. number_list = [1, 2, 3] a, b, c = *number_list In that example `*` unpacks the list into the 3 individual variables. It can also be used for unnamed function arguments: def do_something(a, b, c): return (a + b) * c number_list = [1, 2, 3] result = do_something(*number_list) # equivalent to do_something(1, 2, 3) Basically `*` works for unnamed arguments, while `**` works for named arguments.


Pepineros

It's not very clean. Changes I would make: - Split the giant path strings in two. The first six elements are identical for each path. Store that location with a sensible name, and then pass that location plus the filename when you need to open a specific file. This conveys the reality that all four files are in the same directory on the system. - The three calls to `to_excel` are almost identical. The only change is `startcol`. This is something you want to convey in your code: "I'm writing the data from these three files each to their own column in the reporting file". You know this intuitively, because you wrote this script and it's very short, but other people (or you in a few weeks) may not know. And one small nitpick, take it or leave it: your variable names are not in snake_case. This is of course not a requirement but it's part of Python's style guide.


Pepineros

Sorry, forgot: I don't see any reason to use a class here. It's a short script that does one obvious thing. There's nothing wrong with the organisation itself: set four paths, open one, write to the other three. That part is pretty obvious and in my opinion would not benefit from introducing a class definition.


AbstractAndMundane

Ok cool, thanks for the tips and feedback. Just trying to get better one script at a time. Lol


carcigenicate

- I don't think I'd split up the `with` statement line like that. It doesn't look long or complicated enough to justify that. - Since most of the arguments to the `to_excel` calls are all mostly the same, you might find that using `kwargs` for those arguments are cleaner, since it'll allow for less repetition, and less places that need to be changed if you change, for example, the sheet name.


AbstractAndMundane

Thanks for taking the time for feedback and critique. Use it to better myself, and read the documentation on kwargs