
Suggestion: add an option to ignore special encoding characters #20

@ghost

Description

Hi, this tool works well in many cases, but I found two problems.

  1. Encoding problem

If a file contains characters in another encoding, e.g., Chinese characters or ½, an exception occurs in the extract_comments method.

On my local machine I added errors='ignore' to the following statement. With that change, the special characters are skipped and the rest of each comment is still extracted.

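def extract_comments(filename, mime=None):
    # errors='ignore' drops bytes that cannot be decoded instead of raising
    with open(filename, 'r', errors='ignore') as code:
        ...  # rest of the function unchanged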

So I think we could offer this as an option and let users decide whether undecodable characters should be ignored.
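To illustrate what such an option might look like, here is a minimal sketch. The helper name `read_source` and the `ignore_errors` flag are hypothetical and not part of the tool. The sketch also assumes the file is meant to be read as UTF-8. It only relies on the `errors` argument of Python's built-in `open`, where 'strict' raises on undecodable bytes and 'ignore' drops them.

```python
import os
import tempfile


def read_source(filename, encoding="utf-8", ignore_errors=False):
    """Hypothetical helper: read a source file, optionally skipping undecodable bytes."""
    # 'strict' raises UnicodeDecodeError; 'ignore' silently drops the bad bytes.
    errors = "ignore" if ignore_errors else "strict"
    with open(filename, "r", encoding=encoding, errors=errors) as code:
        return code.read()


def demo():
    """Show both behaviours on a file that is not valid UTF-8."""
    # 0xBD is '½' in Latin-1 but an invalid byte sequence in UTF-8.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".py") as tmp:
        tmp.write(b"# half: \xbd done\n")
        path = tmp.name
    try:
        try:
            read_source(path)
            strict_failed = False
        except UnicodeDecodeError:
            strict_failed = True
        lenient = read_source(path, ignore_errors=True)
    finally:
        os.remove(path)
    return strict_failed, lenient
```

Keeping the strict behaviour as the default would mean existing users see no change, while anyone who hits the exception can opt in to skipping the offending bytes. Note that 'ignore' discards data silently, so the extracted comment may be missing characters.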

  2. Complex string

The tool throws an exception when parsing this Java file. I think the cause may be the complex string on line 99.

Thanks for the tool, it has helped me a lot. I hope it keeps getting better.
