Is it possible to read comments and chart titles? #478

@KfirAlfa

Hello,

Thanks for the awesome crate!!

I'm using Calamine to parse Xlsx files, and it works great. However, I couldn't find a way to extract comments and chart titles with Calamine, so I had to parse them separately with quick-xml (see the code below).

@tafia Can this functionality be added to Calamine (if it doesn't already exist)? I don't want to unzip the file twice.
I'd be happy to open a PR if you could point me to the places in the code where changes need to be made.

Thanks again for providing the crate!

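use std::io::{Read, Seek};

use quick_xml::events::Event;
use quick_xml::Reader as XmlReader; // assumed alias, based on the calls below
use zip::ZipArchive;

// `Comments`, `Result`, and `XlsxParseError` are defined elsewhere in my code.
fn extract_comments(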
    archive: &mut ZipArchive<impl Read + Seek>,
) -> Result<Option<Comments>> {
    let comments_xmls: Vec<String> = archive
        .file_names()
        .filter(|name| name.starts_with("xl/comments"))
        .map(String::from)
        .collect();
    if comments_xmls.is_empty() {
        return Ok(None);
    }
    let mut comments = Comments {
        authors: None,
        comments: None,
    };
    for comment_file in comments_xmls {
        let mut file = archive.by_name(&comment_file)?;
        let mut contents = String::new();
        std::io::Read::read_to_string(&mut file, &mut contents)?;

        let mut reader = XmlReader::from_str(&contents);
        let mut buf = Vec::new();
        let mut in_authors = false;
        let mut in_author_text = false;
        let mut in_comments = false;
        let mut in_comment_text = false;

        loop {
            match reader.read_event_into(&mut buf) {
                Ok(Event::Start(e)) => match e.name().as_ref() {
                    b"authors" => in_authors = true,
                    b"author" => in_author_text = true,
                    b"comments" => in_comments = true,
                    b"comment" => in_comment_text = true,
                    _ => {}
                },
                Ok(Event::Text(e)) => {
                    if in_author_text && in_authors {
                        let author = e.unescape()?.into_owned();
                        comments.authors.get_or_insert(vec![]).push(author);
                    }
                    if in_comment_text && in_comments {
                        let comment = e.unescape()?.into_owned();
                        comments.comments.get_or_insert(vec![]).push(comment);
                    }
                }

                Ok(Event::End(ref e)) => match e.name().as_ref() {
                    b"authors" => in_authors = false,
                    b"author" => in_author_text = false,
                    b"comments" => in_comments = false,
                    b"comment" => in_comment_text = false,
                    _ => {}
                },
                Ok(Event::Eof) => break,
                Err(e) => return Err(XlsxParseError::Xml(e)),
                _ => {}
            }
            // Clear the buffer after every event so it doesn't keep growing.
            buf.clear();
        }
    }

    Ok(Some(comments))
}
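
One caveat with the snippet above: every text event seen inside a `<comment>` is pushed as its own entry, so a comment made of several rich-text runs would presumably end up as several strings. Below is a minimal sketch of accumulating text per comment instead. To keep it standalone, it uses a hypothetical, simplified `XmlEvent` enum in place of quick-xml's events; neither `XmlEvent` nor `collect_comment_texts` is part of quick-xml or Calamine.

```rust
/// Simplified stand-in for the XML events a parser would yield.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum XmlEvent {
    Start(String),
    Text(String),
    End(String),
}

/// Joins all text found between `<comment>` and `</comment>` into one
/// string per comment. Text outside any `<comment>` is ignored.
pub fn collect_comment_texts(events: &[XmlEvent]) -> Vec<String> {
    let mut comments = Vec::new();
    // `Some(..)` while we are inside a `<comment>` element.
    let mut current: Option<String> = None;

    for event in events {
        match event {
            XmlEvent::Start(name) if name.as_str() == "comment" => {
                current = Some(String::new());
            }
            XmlEvent::Text(text) => {
                if let Some(acc) = current.as_mut() {
                    acc.push_str(text);
                }
            }
            XmlEvent::End(name) if name.as_str() == "comment" => {
                // Emit the accumulated text once the comment closes.
                if let Some(done) = current.take() {
                    comments.push(done);
                }
            }
            _ => {}
        }
    }
    comments
}
```

The same idea should carry over to the quick-xml loop: start a fresh `String` on the `comment` start event, append on each text event, and push the result on the `comment` end event.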
