ficpa_article

Category: MATLAB programming
Development tool: Python
File size: 0KB
Downloads: 0
Upload date: 2017-08-28 14:04:28
Uploader: sh-1993
Description: example code and source files for the ficpa.org article "Programming for Efficiency"

File list:
example_1_tb.py (1307, 2017-08-28)
example_2_ficpa_testimonials.py (690, 2017-08-28)
example_3_pdf_scrape.py (764, 2017-08-28)
excel/ (0, 2017-08-28)
excel/pbc_tb.xlsx (10133, 2017-08-28)
excel/tb_import.xlsx (6279, 2017-08-28)
images/ (0, 2017-08-28)
images/example_1/ (0, 2017-08-28)
images/example_1/output.png (10159, 2017-08-28)
images/example_1/pbc_tb.png (19098, 2017-08-28)
images/example_3/ (0, 2017-08-28)
images/example_3/pdf_table.png (390363, 2017-08-28)
images/example_3/pdf_table_csv.png (121978, 2017-08-28)
pdf/ (0, 2017-08-28)
pdf/salary-information-for-the-executive-branch.pdf (1036977, 2017-08-28)
pdf/salary_info.csv (2137, 2017-08-28)

# Programming for Efficiency

This repository includes example Python code from an article written for FICPA called "Programming for Efficiency". The examples were written in Python 3.6 and require the following libraries to be installed:
  • requests
  • beautifulsoup4 (Beautiful Soup, imported as bs4)
  • openpyxl
  • pandas
  • pdfplumber
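
Before running the examples, it can help to confirm that these libraries are importable. The snippet below is a minimal sketch, not part of the original repository: the `missing_modules` helper is a hypothetical name, and `lxml` is added as an assumption because Example 2 passes `'lxml'` to BeautifulSoup as its parser.

```python
from importlib.util import find_spec

# Import names for the libraries listed above. The pip package name can differ
# from the import name: Beautiful Soup installs as "beautifulsoup4" but is
# imported as "bs4". "lxml" is an assumption added here because Example 2 asks
# BeautifulSoup for the 'lxml' parser; it is not in the list above.
REQUIRED_MODULES = ['requests', 'bs4', 'openpyxl', 'pandas', 'pdfplumber', 'lxml']


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_modules()
if missing:
    print('Missing modules: {}'.format(', '.join(missing)))
else:
    print('All required modules are available.')
```

Anything reported as missing can usually be installed with pip under its package name, for example `pip install beautifulsoup4` for `bs4`.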
## Example 1: Using Excel to prep a PBC TB for import

There are several Python libraries designed to work with Excel data, including openpyxl and pandas. While both are powerful and useful, openpyxl makes simple Excel tasks easier, such as reading in a workbook, editing it, and saving it back to Excel. This example uses openpyxl to read in the PBC trial balance, clean it up into an import-ready format on a new tab, and save the result as a new file.

### Take an example of a trial balance formatted like this:
![pbc tb](https://github.com/danshorstein/ficpa_article/blob/master/images/example_1/pbc_tb.png)

### After running example_1_tb.py, the output file includes a new tab with this data:
![import tb](https://github.com/danshorstein/ficpa_article/blob/master/images/example_1/output.png)

## Example 2: Scraping the web

This simple code pulls the author and an excerpt from each of the first three testimonials on FICPA's testimonials page. It uses the requests and beautifulsoup Python libraries, two very powerful libraries for interacting with websites.

~~~~python
import requests
from bs4 import BeautifulSoup

# Download the testimonials page and parse the HTML
r = requests.get('http://www.ficpa.org/Content/Members/Member-Testimonials.aspx')
soup = BeautifulSoup(r.text, 'lxml')

# Each testimonial sits in its own <div class="testimonial-wrapper">
testimonials = soup.find_all('div', class_='testimonial-wrapper')

# Print the author and the first 60 characters of the first three testimonials
for testimonial in testimonials[:3]:
    author = testimonial.find(class_='testimonial-author').get_text()
    excerpt = testimonial.get_text().lstrip()
    print('Author: {}'.format(author))
    print('Excerpt: {}'.format(excerpt[:60]))
    print('-------------------------------------------------------------------')
~~~~

An example of resulting output is:
Author: John Smith, CPA — Smith & Smith, LLC
Excerpt: Joining the FICPA and having the chance to participate in th
-------------------------------------------------------------------
Author: Jamie J. Johnson — J. J. Johnson & Associates, PA, CPA
Excerpt: I will always feel honored to be able to contribute – and be
-------------------------------------------------------------------
Author: Bobby L. O’Charley — Longfellow Consulting Group
Excerpt: I recently attended the 2014 University of South Florida Acc
-------------------------------------------------------------------
## Example 3: Extracting tables from PDFs

This is one of my new favorite tools. pdfplumber can extract text, and even identify tables, from PDF files. This example uses the PDF file from https://www.opm.gov/policy-data-oversight/data-analysis-documentation/federal-employment-reports/reports-publications/salary-information-for-the-executive-branch.pdf

### Let's say you wanted to extract the data from this table on page 2:
![pdf table](https://github.com/danshorstein/ficpa_article/blob/master/images/example_3/pdf_table.png)

Using the Python code in example_3_pdf_scrape.py, the output looks like this:
![pdf table output](https://github.com/danshorstein/ficpa_article/blob/master/images/example_3/pdf_table_csv.png)
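
The table extraction described in this example can be sketched roughly as follows. This is not the contents of example_3_pdf_scrape.py, only a minimal illustration under a few assumptions: the table of interest is the one pdfplumber's `extract_table()` returns for page 2, and the helper names `clean_rows` and `pdf_table_to_csv` are hypothetical.

```python
import csv


def clean_rows(rows):
    """Normalise extracted table rows.

    pdfplumber returns a table as a list of rows, where an empty cell may be
    None and a wrapped cell may contain newlines. This turns None into '',
    collapses runs of whitespace into single spaces, and drops empty rows.
    """
    cleaned = []
    for row in rows:
        cells = [' '.join((cell or '').split()) for cell in row]
        if any(cells):
            cleaned.append(cells)
    return cleaned


def pdf_table_to_csv(pdf_path, csv_path, page_number=2):
    """Extract one table from the given page and write it to a CSV file."""
    # Third-party import kept local so clean_rows can be used on its own.
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        page = pdf.pages[page_number - 1]  # pdf.pages is zero-indexed
        table = page.extract_table()       # None when no table is detected

    if table is None:
        raise ValueError('No table found on page {}'.format(page_number))

    with open(csv_path, 'w', newline='') as f:
        csv.writer(f).writerows(clean_rows(table))
```

From the repository root, a call such as `pdf_table_to_csv('pdf/salary-information-for-the-executive-branch.pdf', 'salary_info_sketch.csv')` would write whatever table pdfplumber detects on page 2. Real PDFs often need some tuning of the table-detection settings, so the result may not match the screenshot above exactly.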
