A Python module to find valid copyright and license expressions in a file.
3
-
This module is based on Google LicenseClassifier.
1
+
# Golicense-Classifier
2
+
A Python-based module to find valid copyright and license expressions in a file.
3
+
4
+
_Note: This module is based on Google LicenseClassifier._
5
+
6
+
## Installation
7
+
Currently, this package supports only Linux. Work is in progress for Windows and Mac.
8
+
9
+
To install from PyPI, use
10
+
```sh
11
+
pip install golicense-classifier
12
+
```
13
+
14
+
## Usage
15
+
To get started, import the `LicenseClassifier` class from the module as
16
+
17
+
```python
18
+
from LicenseClassifier.classifier import LicenseClassifier
19
+
```
20
+
21
+
_Note: Work on copyright statement detection is still in progress. Expect some issues, mostly with binary files._
22
+
23
+
The class comes bundled with several methods for scanning.
24
+
25
+
1. `scan_directory`
26
+
27
+
This method recursively walks through a directory and finds license expressions and copyright statements. It returns a dictionary with keys `header` and `files`.
28
+
29
+
### Usage
30
+
___
31
+
```python
32
+
classifier = LicenseClassifier()
33
+
res = classifier.scan_directory('PATH_TO_DIR')
34
+
```
35
+
### Optional Parameters
36
+
___
37
+
- `max_size`
38
+
39
+
Maximum size of file in MB. Default is set to 10MB. Set `max_size < 0` to ignore size constraints.
40
+
41
+
- `use_buffer`
42
+
43
+
`(Experimental)` Set to `True` to use buffered file scanning. `max_size` will be used as the buffer size.
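
A minimal sketch of consuming the value returned by `scan_directory`, relying only on what is stated above: the result is a dictionary with keys `header` and `files`. The helper name `split_scan_result` is hypothetical, and the placeholder values in the demo say nothing about the real shape of either entry.

```python
def split_scan_result(result):
    """Return (header, files) from a scan_directory-style result dictionary.

    Hypothetical helper: it only relies on the two documented keys.
    """
    missing = {"header", "files"} - set(result)
    if missing:
        raise KeyError(f"scan result is missing keys: {sorted(missing)}")
    return result["header"], result["files"]


# Placeholder result for illustration only; a real one would come from
#   res = LicenseClassifier().scan_directory('PATH_TO_DIR')
placeholder = {"header": {}, "files": []}
header, files = split_scan_result(placeholder)
print(type(header).__name__, type(files).__name__)
```

Checking for the keys up front gives a clear error if a scan ever returns something unexpected, instead of failing later during processing.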
44
+
45
+
46
+
2. `scan_file`
47
+
48
+
This method finds license expressions and copyright statements in a single file.
49
+
50
+
### Usage
51
+
___
52
+
```python
53
+
classifier = LicenseClassifier()
54
+
res = classifier.scan_file('PATH_TO_FILE')
55
+
```
56
+
### Optional Parameters
57
+
___
58
+
- `max_size`
59
+
60
+
Maximum size of file in MB. Default is set to 10MB. Set `max_size < 0` to ignore size constraints.
61
+
62
+
- `use_buffer`
63
+
64
+
`(Experimental)` Set to `True` to use buffered file scanning. `max_size` will be used as the buffer size.
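
The optional parameters above could be applied to several files in one go, roughly as sketched below. This assumes that `max_size` and `use_buffer` are accepted by `scan_file` as keyword arguments under exactly those names; the helper `scan_files` is hypothetical and not part of the package.

```python
def scan_files(classifier, paths, max_size=10, use_buffer=False):
    """Scan each path with classifier.scan_file and collect the results.

    Hypothetical helper. `classifier` is expected to be a LicenseClassifier
    instance. `max_size` is in MB (10 mirrors the documented default) and a
    negative value disables the size limit.
    """
    results = {}
    for path in paths:
        # Assumption: both options are passed as keyword arguments.
        results[path] = classifier.scan_file(
            path, max_size=max_size, use_buffer=use_buffer
        )
    return results


# Intended use (requires the package to be installed):
#   from LicenseClassifier.classifier import LicenseClassifier
#   res = scan_files(LicenseClassifier(), ['PATH_TO_FILE'], max_size=-1)
```

Passing the classifier in as an argument means a single instance is reused for every file rather than created once per scan.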
65
+
66
+
## Setting Custom Scanning Threshold
67
+
68
+
You can set a custom scanning threshold that best suits your needs. For this, use the `threshold` parameter while creating the object, as