AUTOMATIC EXTRACTION OF FILLED-IN INFORMATION FROM BANKCHECKS BASED ON PRIOR KNOWLEDGE ABOUT LAYOUT STRUCTURE

Alessandro L. Koerich - CEFET/PR
Lee Luan Ling - UNICAMP

PLAN
• Automatic Bankcheck Processing
• Characteristics of Checks
• Bankcheck Modeling
• Background and Printed Info Elimination
• Experimental Results
• Conclusion

Two Main Topics of Research
• Information Extraction
  – check identification through MICR line
  – based on prior knowledge about layout structure
  – database
    • background patterns
    • customer’s data
• Information Processing
  – Handwriting Recognition
    • digit amount, worded amount, payee’s name, date, city
  – Signature Verification

Automatic Bankcheck Processing
• Millions of handwritten and machine printed bankchecks have to be processed every day
  • 260 million per month
• Handwritten or Machine Printed Bankcheck Processing
• Only the information encoded in the MICR line can be handled automatically
  • bank, agency, account, check, serial and verification codes
• The filled-in information is manually handled
Characteristics of Bankchecks
• Using knowledge about the basic structure of a document to process any document of the same type.
• Brazilian Checks
  – Complex Layout Structure
  – Standard Size
  – Standard Layout

Bankcheck Modeling
• The division in blocks is not sufficient
• We must consider the overlapping of information
• We propose that a check can be divided into three layers
  • Background Pattern
  • Printed Information
  • Filled-in Information

Background and Printed Information Elimination
• The background pattern and the printed information only disturb the processing of the filled-in information
• Goal
  – Eliminate the background pattern without degrading the filled-in parts

Background and Printed Information Elimination (cont.)
• Position Adjustment
  – Skew
  – Vertical
  – Horizontal
• Background Elimination
  – Subtraction
    • $I_{WB}(x,y) = I_{CD}(x,y) - I_{CB}(x,y)$
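The background subtraction I_WB(x,y) = I_CD(x,y) - I_CB(x,y) can be illustrated with a minimal sketch. The slides do not define the symbols or the pixel convention, so several points here are assumptions: I_CD is read as the position-adjusted check image, I_CB as the stored background image for that check, images are lists of rows of gray levels (0 = black, 255 = white), and the subtraction is done on ink darkness (255 - gray) so that a clean result stays white. The function name `subtract_background` is hypothetical.

```python
def subtract_background(check, background, white=255):
    """Sketch of I_WB = I_CD - I_CB, computed on ink darkness (white - gray).

    Assumption: both images are already aligned (skew, vertical and
    horizontal adjustment done) and have identical dimensions.
    """
    if len(check) != len(background) or any(
        len(c_row) != len(b_row) for c_row, b_row in zip(check, background)
    ):
        raise ValueError("images must have the same size")

    result = []
    for check_row, bg_row in zip(check, background):
        row = []
        for c, b in zip(check_row, bg_row):
            # Darkness that remains once the background pattern is removed.
            darkness = (white - c) - (white - b)
            # Clamp so noise never yields values outside the gray range.
            darkness = max(0, min(white, darkness))
            row.append(white - darkness)
        result.append(row)
    return result


# Toy 2x3 images: the background has two gray pattern pixels (200);
# the check adds two black ink pixels (0) on white areas.
background = [[255, 200, 255],
              [255, 255, 200]]
check = [[255, 200, 0],
         [255, 0, 200]]
clean = subtract_background(check, background)
```

In this sketch the pattern pixels become white and the ink pixels stay black. Ink written over a dark pattern pixel comes out lighter than it was, which is one way a plain subtraction can degrade the filled-in parts.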
Extraction of Filled-in Information
• Information introduced by bank’s customers
  – Digit amount, worded amount, payee’s name, city and date
• Relies on prior knowledge about bankcheck layout structure
• Check identification through MICR line
• Standardized Layout --- Template

Baselines Elimination
• Compute horizontal projection profiles
• The points with high values of PPh indicate the position of baselines
• For these positions convert black pixels to white pixels

Printed Characters Elimination
• Printed character strings appearing under the baseline dedicated to signature
  • Customer’s name, register identification number
• Generate a binary image which contains similar information
  • Database
• Subtraction
  • $I_{WP}(x,y) = I_{WB}(x,y) - I_{GP}(x,y)$

Experimental Results
• Real Brazilian bankcheck images
• 200 dpi and 256 gray levels
• 100 real bankcheck images
  – 100 Financial Institutions
  – 25 different writers
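The baseline elimination and printed-character subtraction steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: images are binary lists of rows with 1 = black and 0 = white, a row counts as a baseline when its profile value reaches a fraction of the image width (the `min_fraction` parameter and its default are hypothetical, since the slides give no threshold), and I_GP is read as a binary mask of the printed characters generated from the database.

```python
def horizontal_projection(binary):
    """PPh: number of black pixels (value 1) in each row."""
    return [sum(row) for row in binary]


def remove_baselines(binary, min_fraction=0.8):
    """Turn black pixels white in rows whose profile value is high.

    min_fraction is a hypothetical threshold: a row is treated as a
    baseline when at least this fraction of its pixels is black.
    """
    width = len(binary[0]) if binary else 0
    cleaned = []
    for row, count in zip(binary, horizontal_projection(binary)):
        if width and count >= min_fraction * width:
            cleaned.append([0] * width)   # baseline row: all white
        else:
            cleaned.append(list(row))     # keep the row unchanged
    return cleaned


def subtract_mask(image, mask):
    """I_WP = I_WB - I_GP for binary images, clamped at 0 (white)."""
    return [
        [max(0, p - m) for p, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# Toy 4x6 image: row 2 is a printed baseline, row 3 holds one printed pixel.
image = [
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
]
profile = horizontal_projection(image)
no_baselines = remove_baselines(image)

# Hypothetical printed-character mask (I_GP) for the same layout.
printed_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]
filled_in = subtract_mask(no_baselines, printed_mask)
```

Clearing a whole row also erases any handwriting stroke that crosses the baseline, so a sketch like this would likely need the post-processing step listed among the conclusions.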
Conclusions
• Method for extracting the filled-in information from bankchecks
• Extraction of different items of information
  • digit amount, worded amount, payee’s name, date, city and signature
• The method provides satisfactory results
• Post-processing to improve quality
• Automatic Bankcheck Recognition System