Skip to content

Default config for vertical grouping has bad results for vertically aligned checkboxes #24

@martinkozle

Description

@martinkozle

I am creating this issue to help anybody having the same issue with vertically aligned checkboxes not being detected well.

The group_size_range config option gets overwritten to a hardcoded value of (1, 1) at the start of the get_checkboxes pipeline. So setting that config option does nothing when using this function.

By default in the config the vertical_max_distance option is set to 10, meaning if you are trying to detect vertically aligned checkboxes (like in a form) it will give really bad results as it will see the whole column as a single group. I don't know if this is intended and what the use case is. I don't quite understand the grouping logic in the library.

Ways to fix it would be to either set this option to 0, and then find and filter out unwanted close detections with your own needed logic. Or copy over the get_checkboxes function without that first hardcoding line (but this might group horizontal checkboxes). I don't understand the difference between the vertical and the horizontal grouping but vertical grouping for checkboxes seems to be a bit faulty.

Metadata

Metadata

Assignees

No one assigned

    Labels

    good first issueGood for newcomersquestionFurther information is requested

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions