Assembling social movement organizations from Stanford tags #8

The Stanford NER tagger labels each word individually rather than returning whole SMO names. For example, Occupy Wall Street is returned as `[('Occupy', 'ORGANIZATION'), ('Wall', 'ORGANIZATION'), ('Street', 'ORGANIZATION')]`.

To parse this into a single string, I've assumed that every run of consecutive `ORGANIZATION` tags refers to the same SMO. Does this seem like a reasonably robust approach, or should we try to come up with something else?

It seems to work as long as punctuation is tokenized separately (e.g. the commas in a list of SMOs come back as their own tokens without an organization tag), but I probably haven't thought about all edge cases.
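
For discussion, here is a minimal sketch of the grouping rule described above, not necessarily the code currently in the repo. The function name `assemble_organizations`, the `"O"` tag for non-entity tokens, and the sample sentence (including Greenpeace) are assumptions made for illustration.

```python
from itertools import groupby


def assemble_organizations(tagged_tokens, label="ORGANIZATION"):
    """Join each run of consecutive `label`-tagged tokens into one string.

    `tagged_tokens` is a list of (token, tag) pairs, in the shape shown
    in the example above.
    """
    organizations = []
    # groupby splits the sequence into runs of adjacent pairs sharing a tag.
    for tag, run in groupby(tagged_tokens, key=lambda pair: pair[1]):
        if tag == label:
            organizations.append(" ".join(token for token, _ in run))
    return organizations


# Hypothetical tagger output; "O" is assumed to mark non-entity tokens.
tagged = [
    ("Occupy", "ORGANIZATION"),
    ("Wall", "ORGANIZATION"),
    ("Street", "ORGANIZATION"),
    (",", "O"),
    ("Greenpeace", "ORGANIZATION"),
    ("and", "O"),
    ("activists", "O"),
]

print(assemble_organizations(tagged))
# prints ['Occupy Wall Street', 'Greenpeace']
```

One edge case this makes visible: if two SMOs appear back to back with no separating token (for instance, if the tokenizer drops the comma), they are merged into a single string.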
