Posts

Showing posts from May, 2018

GSoC 2018: Week 2

It's been 12 days since the Coding Period of Google Summer of Code 2018 started and 7 days since my last blog post. Debian's GSoC admins have decided to ask for a weekly log from students working with Debian (every Friday). So here is what I did from 19th May to 25th May.




This week we created a new organisation, invoice-x, for easier management and reviewing. Both libraries, invoice2data and factur-x, have now been moved there. I have been resolving incoming issues on invoice2data and improving the factur-x library.
As I mentioned in my previous blog post, I planned to cover tests for 'invoice2data as a library', and I did. Link to commit.
I have added checks for the tesseract module, which verify that tesseract and imagemagick are installed for invoice2data. This was related to issue-92.
Added a related projects section to invoice2data and factur-x. Issue-14
Then, I spent time reading about the French government's new standard for exchanging electronic invoices, called Factur-X t…

GSoC 2018: Week 1

It's been 5 days since the Coding Period of Google Summer of Code 2018 started. Debian's GSoC admins have decided to ask for a weekly log from students working with Debian (every Friday). So here is what I did from 14th May to 18th May.



I have been working on fixing current issues with the invoice2data library. Here is the list:
I have made it possible to dump all fields to json and csv files. #114
We have adopted the Numpydoc convention for docstrings. Hence, I have added proper docstrings to all public functions and classes. #119
Then I investigated the cause of "Resource Warnings" and came to the conclusion that they are due to upstream libraries.
I have written tests for command line arguments. I have been committing to the issue/8 branch of invoice2data. The functions that I worked on are: content checking in json files, the copy argument and the exclude builtin template argument.
I then further investigated the tesseract-ocr module and will be working to improve it and report better messa…

GSoC 2018, Debian: Community Bonding

Hey all! Last time on "What's up April?" I ended by saying "A few more interesting things happened in these one and half month but I cannot really tell you about those right now, need to wait for a week or two". I know it has been more than two weeks 😜, almost a month now. But here it is:  **drum-roll begins** 
My proposal for Google Summer of Code 2018 with Debian has been accepted. I will be working on "Extracting data from PDF invoices and bills for financial accounting". Here is the link. 23rd April to 14th May is the Community Bonding period, where we, the selected students, get familiar with our organisation, communicate with our mentors and try to set up a workflow. For this project, I am working with three mentors: Manuel Riel, Thomas Levine and Pieter Willem Moerenhout.
We have been discussing the project and decided to make slight changes to my proposal. Below is the gist of it.
Despite efforts to develop new formats for the exchange of invoices, most in…