RxRefund- Idil Sukas, Nilay Altun, Gulce Basar
DEFINITION
Project Background and Overview
Pharmacists buy medicine in bulk. When the medicine expires, the remaining pills are usually thrown into a box in a corner, and no one tries to recover any of the money that was paid for them. This project is one part of a larger software product that informs pharmacists about the income that could be recovered from expired pills. Each pill bottle has an expiration date, a manufacturer and a return address for expired pills, and this information is printed on the bottle label. The main software aims to store the relevant information from a picture of a bottle and to tell the pharmacist how much they could get back for the expired pills in different timeframes. It will also hold the address and contact information of the collector so that the pills can be returned. Our part, as our senior project, is to take the image of the bottle, read the necessary information from it depending on the label type, and store that information so it can be used by the main software.
Business Objectives
Label Reading
The first objective is digitizing the labels of pill bottles. To digitize them, we need a scanner or digital camera and an OCR system.
- Understanding the usage of OCR technology and functionality.
- Finding the appropriate OCR software for project usage.
- Researching the cost of the software available on the market.
- Selecting the hardware: digital camera or scanner.
- Getting editable and searchable data from the pill bottles and keeping it in a specific format.
- Deciding on the format: researching the best format for easy search and use of the data.
What is OCR and OCR Technology?
Optical Character Recognition, or OCR, is a technology that enables you to convert different types of documents, such as scanned paper or boxes, PDF files, or images captured by a digital camera, into editable and searchable data. We aim to use an OCR system to read a pill bottle. Three well-known basic principles lie behind OCR: integrity, purposefulness and adaptability.
How to use OCR Software?
ABBYY FineReader is an easy-to-use optical character recognition program. It works in a few stages: open or scan the documents (images), recognize them, and then save the result in a text format (or another convenient format).
Alternative OCR Software
Legend: + extracted; - not extracted; x no need to extract; NA not available
We tried several OCR programs and compared them by usability. We decided to use ABBYY FineReader because it gave the most accurate results. The full version of ABBYY FineReader will be purchased, and it will run on a remote server when an inquiry is made.
Alternative Label Reading Technology
We aim to use an OCR system to read a pill bottle. However, we also want to be prepared for unexpected problems, so C++ with OpenCV could be an alternative way to read a pill bottle. The OCR itself could be PC-based or cloud-based. In either case, the label images are processed by text recognition programs, and developers can then edit and order the results.
- Becoming familiar with OpenCV programming.
- Researching the compatibility of the software with our project.
Sample Inputs
Database Usage & Creation
We need information about the medicine, such as the manufacturer name, expiration date, collector information and the number of pills in a full bottle. Obtaining this data requires a database that stores it.
-Creating New Database:
- Finding a proper database, possibly Oracle Database because we are familiar with it.
- Researching all the medicines on the market and keeping the information in the created database.
- Problems may occur, such as a lack of time.
-Using Existing Database:
- Contacting pill manufacturers and asking for access to their databases.
- Market research to find a company that offers its database as a service.
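Whichever option is chosen, the lookup the project needs is small: given an NDC number, return what is known about the medicine. The following is only a minimal sketch of that idea. It uses Python's built-in sqlite3 module as a stand-in for the real database (Oracle is only a possibility at this stage), and the table name, column names and the lookup function are hypothetical. The single row uses the NDC number and manufacturer of one of our sample bottles; values we do not know are left empty.

```python
import sqlite3

# Stand-in, in-memory database. Table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE medicine (
        ndc TEXT PRIMARY KEY,
        manufacturer TEXT NOT NULL,
        return_address TEXT,
        pills_per_bottle INTEGER
    )
    """
)

# Manufacturer comes from a sample bottle; unknown values stay NULL.
conn.execute(
    "INSERT INTO medicine VALUES (?, ?, ?, ?)",
    ("68180-469-01", "LUPIN", None, None),
)


def lookup(ndc):
    """Return the stored row for an NDC number as a dict, or None if absent."""
    cur = conn.execute(
        "SELECT ndc, manufacturer, return_address, pills_per_bottle "
        "FROM medicine WHERE ndc = ?",
        (ndc,),
    )
    row = cur.fetchone()
    if row is None:
        return None
    columns = [d[0] for d in cur.description]
    return dict(zip(columns, row))
```

Keying the table on the NDC number matches the plan to obtain the pill count through the NDC rather than from the image.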
Control of Expiration Date
Control of the expiration date is the main objective of the project. It uses the information obtained from label reading, and checks the other information about the pill by using the database that we created.
Sample Output
Warning System
The warning system is the main output of the project. It shows the expiration date of the given label. If the pill is expired, the program automatically shows the user a warning message together with the pill manufacturer's information for the refund process.
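The check behind the warning can be sketched as follows. This is a hedged illustration, not the final program: the function names and the message wording are hypothetical, and it assumes that a label showing only a month and a year is valid through the last day of that month.

```python
import calendar
from datetime import date


def last_valid_day(exp_year, exp_month):
    """Last day of the expiration month (assumed to be the final valid day)."""
    last_day = calendar.monthrange(exp_year, exp_month)[1]
    return date(exp_year, exp_month, last_day)


def expiration_warning(exp_year, exp_month, manufacturer, today):
    """Return a warning string if the bottle is expired, otherwise None."""
    if today > last_valid_day(exp_year, exp_month):
        return (
            f"WARNING: expired {exp_month:02d}/{exp_year}. "
            f"Contact {manufacturer} for the refund process."
        )
    return None
```

Passing the current date in as a parameter, instead of reading the system clock inside the function, makes the check easy to test with fixed dates.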
Project Objectives
Short term objectives
Short-term goals are mainly focused on deciding on the proper technology (OCR or OpenCV) and database, and on preparing documentation. The other goal is meeting the owner of the company and reviewing the details.
Long term objectives
In the long term, we aim to develop a prototype using OCR technology. The prototype includes converting the image to integers or another suitable data type. However, because it is only a prototype, there may be some problems with reading. This prototype will show us the path we will follow in the coming months.
Project Constraints
Cost: Since the project has a budget determined by the company, the helper software used for reading and converting the image will be chosen efficiently so that it fits that budget.
Time: Since this is our senior project, the time we have to do the necessary research and to successfully finish converting the images is limited to two semesters. Having limited time might cause problems in the development and improvement stages of the project.
Experience: Since we do not have much experience in this field, the research stage of the project may take longer than expected. Detecting and correcting a mistake could also take longer.
APPROACH
The Project Schedule
The project will end after eight months. We have prepared the plan for the first four months. According to this plan, the first two months will be spent researching technologies (previous similar projects and the technologies they used) and preparing documentation (the correct planning). The last two months will be spent developing a sample prototype.
PROJECT COSTS / BUDGET
We have a budget of up to $500. The following are the prices of the OCR software we tried:
FreeOCR: Free
FreeOCR Word: Free
ABBYY FineReader: $169.99
SimpleOCR: Free
PowerPDF: $99.99
Soda Pdf 8: $99.00 (on sale for $49.95)
ABBYY PDF: $79.99
Readiris: £69.00
Omnipage: $499.99
PROJECT MANAGEMENT PLAN
Analysis:
Data Analysis
We discussed the most important data we should get from the bottles. We decided to start with 5 labels: the NDC number, manufacturer, expiration date, lot number and price. After getting those 5 labels and storing them in the database, we will also store the pictures. We discussed the possibility of taking multiple pictures of the bottle or using multiple cameras at once. We also decided that we will store this information in a database and then use it from the database in the program. We also discussed possible options. Jamie said that the program must be mainly Windows based and that the following research must be done accordingly.
- NDC Number: Short for National Drug Code. It identifies the labeler, product, and trade package size. The first segment, the labeler code, is assigned by the FDA. A labeler is any firm that manufactures (including repackers or relabelers) or distributes (under its own name) the drug. The format is labeler code-product code-package code. It is the same number that is on the barcode.
- Manufacturer of the drug: The manufacturer's name and address are printed on the pill bottle. This information is needed to return the bottle to the correct address.
- Expiration date: The expiration date is the final day that the manufacturer guarantees the full potency and safety of a medication. Drug expiration dates exist on most medication labels, including prescription, over-the-counter (OTC) and dietary (herbal) supplements. U.S. pharmaceutical manufacturers are required by law to place expiration dates on prescription products prior to marketing.
- Lot No: Lot numbers are issued by the manufacturer of the product. When manufacturers produce a product, they do so in batches. Each batch is assigned a unique number that makes it possible for manufacturers to track exactly when a problem occurred and which products need to be recalled.
- Price: The price the pharmacist paid for that particular bottle. It is written as a short letter code (three letters on most of our sample bottles) on the sticker the pharmacy puts on the bottle. The price code has a fixed place on the sticker that all pharmacists use. The letters of the word "PRECAUTION" stand for the digits "1234567890", in order.
- Purchase Date: The date the pharmacist purchased the bottle. It follows the format YMMDD.
- Pill Number: The number of pills in the bottle. It will be obtained through the NDC number, not from the image.
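The price and purchase date codes can be decoded mechanically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the last two digits of a price code are taken to be cents (which is how our sample bottles read), and because YMMDD stores a single year digit, the decade is assumed to be the 2010s unless another one is passed in.

```python
from datetime import date

# "PRECAUTION" maps letter by letter onto "1234567890" (P=1, R=2, ..., N=0).
KEYWORD = "PRECAUTION"
DIGITS = "1234567890"
LETTER_TO_DIGIT = dict(zip(KEYWORD, DIGITS))


def decode_price(code):
    """Decode a price code into cents; the last two digits are assumed to be cents."""
    digits = "".join(LETTER_TO_DIGIT[ch] for ch in code.upper())
    return int(digits)


def decode_purchase_date(code, decade=2010):
    """Decode a YMMDD code. Only one year digit is stored, so the decade is an assumption."""
    if len(code) != 5 or not code.isdigit():
        raise ValueError(f"expected five digits (YMMDD), got {code!r}")
    year = decade + int(code[0])
    return date(year, int(code[1:3]), int(code[3:5]))
```

For example, decode_price("OEA") gives 935 cents, that is $9.35, and decode_purchase_date("50511") gives 11 May 2015. A letter that is not in the keyword raises a KeyError, which is a useful signal that the OCR misread the sticker.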
Resource Handling
Bottle 1:
1- NDC number is 68180-469-01; 68180 is the labeler, 469 is the product code and 01 is the package code.
2- Manufacturer is LUPIN (the labeler part of the NDC number, 68180, belongs to Lupin).
3- Expiration date is November 2017.
4- Lot Number is G500211.
5- Price is OEA; O-9, E-3, A-5, so $9.35.
6- Purchase date is 50511, so it is 11 May 2015.
Bottle 2:
1- NDC number is 68180-351-09; 68180 is the labeler, 351 is the product code and 09 is the package code.
2- Manufacturer is LUPIN (the labeler part of the NDC number, 68180, belongs to Lupin).
3- Expiration date is July 2017.
4- Lot Number is G406895.
5- Price is AAA; A-5, so $5.55.
6- Purchase date is 50324, so it is 24 March 2015.
Bottle 3:
1- NDC number is 0093-7366-98; 0093 is the labeler, 7366 is the product code and 98 is the package code.
2- Manufacturer is TEVA.
3- Expiration date is Sept 2016.
4- Lot Number is 41L564.
5- Price is IUU; I-8, U-6, so $8.66.
6- Purchase date is 50323, so it is 23 March 2015.
Bottle 4:
1- NDC number is 0456-3210-60; 0456 is the labeler, 3210 is the product code and 60 is the package code.
2- Manufacturer is Forest.
3- Expiration date is Dec 2019.
4- Lot Number is A436718.
5- Price is EENUR; E-3, N-0, U-6, R-2, so $330.62.
6- Purchase date is 50513, so it is 13 May 2015.
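Splitting the NDC numbers of the four bottles into their segments can be sketched as below. This is an assumption-laden illustration rather than the project's code: the class and function names are hypothetical, and the labeler table contains only the three labeler codes that appear on our sample bottles, not a complete list.

```python
from typing import NamedTuple


class Ndc(NamedTuple):
    labeler: str
    product: str
    package: str


# Labeler codes seen on the four sample bottles only (not a complete list).
LABELERS = {"68180": "LUPIN", "0093": "TEVA", "0456": "Forest"}


def parse_ndc(text):
    """Split 'labeler-product-package' into its three segments, kept as strings."""
    parts = text.strip().split("-")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a three-segment NDC: {text!r}")
    return Ndc(*parts)


def manufacturer_for(ndc):
    """Look up the manufacturer by labeler code; returns None if the code is unknown."""
    return LABELERS.get(ndc.labeler)
```

The segments are kept as strings so that leading zeros, as in 0093 and 0456, are not lost.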
OCR Trials
This picture shows that the OCR was able to analyze the first photo as text. This side of the bottle has the NDC number and the manufacturer.
This picture shows that the OCR failed to analyze the photo correctly and labeled all of it as a picture.
This picture shows that after we changed the default OCR analysis of the photo and marked the lot number and expiration date areas as text, the OCR was able to get those values into text format.
This picture shows that the OCR failed to analyze the photo correctly because it was crooked. This photo has the price.
This picture shows that after we selected the sticker region and fixed its alignment, the OCR was able to get the price into text format.
Design
The data attributes on the RxRefund website are the same as the fields in the analysis section. For an inquiry, the fields are filled in manually. To access the website, we suggested getting a user ID and password so that we can validate the data we have. The website has two options for the expiration date: you can type in the date, or you can choose one from the provided expiration dates if the bottle expires soon. The other extra fields are partial quantity and return reason. If the return reason is anything other than expiration, the pharmacist must choose it. Partial quantity is the number of pills remaining in the bottle; the pharmacist will also type it in.
Hardware
We will either use a single camera and take multiple pictures of the bottle by turning it, or use multiple cameras so that the bottle can be placed in the middle of them and all the pictures taken at the same time. Using multiple cameras would take less time for the pharmacist, and it would also lower the error rate caused by the pharmacist not taking the picture correctly.
MEETING NOTES
Camera: We discussed whether we should use 4 cameras or a single camera that takes a turning video of the bottle, to be able to get a panoramic picture of it.
Database: We decided on the labels we will need from the database and decided to arrange a meeting with the website developer. We also discussed opening a test server.
Program: We decided to start by using the OCR program manually to convert the picture to text. Then we will start working on the program (probably Java) that reads the text file and gets the labels by searching for specific keywords (we will start with the NDC). Then we will store the extracted data in an Excel file or in the test database server.
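The keyword search step can be sketched as follows. The sketch is written in Python only to keep it short (the program itself will probably be in Java), and everything in it is an assumption: the regular expressions, the function names, and the layout of the sample text, which is a hypothetical OCR output built from the values on Bottle 1. Real OCR output will need patterns tuned to what the labels actually look like. CSV is used as a simple format that Excel can open.

```python
import csv
import io
import re

# Hypothetical keyword patterns; real labels will need tuning.
PATTERNS = {
    "ndc": re.compile(r"\bNDC\s*[:#]?\s*(\d{4,5}-\d{3,4}-\d{1,2})", re.IGNORECASE),
    "lot": re.compile(
        r"\bLOT\b\s*(?:NO\.?|NUMBER)?\s*[:#]?\s*([A-Z0-9]+)", re.IGNORECASE
    ),
    "exp": re.compile(
        r"\bEXP\b\.?\s*(?:DATE)?\s*[:#]?\s*([A-Za-z]{3,9}\.?\s*\d{4})",
        re.IGNORECASE,
    ),
}


def extract_fields(ocr_text):
    """Search the OCR text for each keyword; fields that are not found are None."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[name] = match.group(1).strip() if match else None
    return fields


def to_csv(rows):
    """Serialize extracted rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(PATTERNS))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical OCR output using the values read from Bottle 1.
sample = "NDC 68180-469-01\nLUPIN\nLOT NO: G500211\nEXP: NOV 2017\n"
row = extract_fields(sample)
```

Returning None for a missing field, instead of raising an error, lets the program store a partial row and flag it for the pharmacist to complete by hand.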
























