182 commits

Author SHA1 Message Date
Florent
09b0de545e set replace_dots default value to False 2014-08-21 10:27:20 +02:00
Florent
ce133dcf8a Fix documentation of nr parameter 2014-08-19 08:58:12 +02:00
Laurent Bachelier
8c3e92aa31 fix re flags usage for Python 2.6
fixes #1444
2014-08-13 19:49:24 +02:00
Laurent Bachelier
3a3e3d0438 Help debug Filter errors 2014-08-05 20:27:48 +02:00
Laurent Bachelier
94deb53352 Add support for a default in Env 2014-08-05 20:27:48 +02:00
Laurent Bachelier
a0559e539e CleanText improvements
* \t is always in \s so no need to add it
* handle the non-breaking space thanks to the re.UNICODE flag
* add an option to keep (but normalize) newlines
* more tests
2014-08-05 20:27:48 +02:00
Laurent Bachelier
b6c6ed2306 Fix filters doctests and add them to the usual test run 2014-08-05 20:27:48 +02:00
Laurent Bachelier
d19e0637e4 CleanText: Always return unicode 2014-08-05 20:27:48 +02:00
Laurent Bachelier
819de1ace0 Do not crash if total_seconds() is not implemented
Which is the case with Python 2.6.
We could do the calculation ourselves, but this is not a very important
feature.
2014-08-05 20:27:48 +02:00
Romain Bignon
1005197a92 add CsvPage 2014-07-11 15:24:24 +02:00
Romain Bignon
c90b5844e4 split filters into several files 2014-07-11 15:24:19 +02:00
smurail
8cb44a45a7 possibility to set custom separators for decimals 2014-07-09 19:02:23 +02:00
Florent
6759dec279 Add missing import 2014-07-09 14:58:49 +02:00
Florent
9ae7cc692f Fix a regression: restore ListItem version
And move it to the same place as before to help the diff...
2014-07-09 14:51:34 +02:00
Florent
1daa866949 Move the import of html2text outside of misc 2014-07-09 11:43:14 +02:00
Florent
fb555c3079 Do not import lxml in headers of page.py 2014-07-09 10:23:24 +02:00
Florent
8a2a1ece5e Do not always import module used only in debug mode 2014-07-09 10:23:24 +02:00
Florent
76cb004eb4 Move ItemListTable-Element outside of page.py
One of the goals is to avoid importing all modules needed by filters when
loading the page file.

With the same goal, move the import of parsers into the class definition.
2014-07-09 10:23:20 +02:00
Romain Bignon
983ed221e2 ability to use filters as classes in chain (refs #1426) 2014-07-05 20:22:45 +02:00
Romain Bignon
2268eb2ff1 ability to use Dict['a']['b']['c'] instead of Dict('a/b/c') (refs #1426) 2014-07-05 20:22:39 +02:00
Romain Bignon
8efd37e71d overload & and | operators to chain filters (refs #1426) 2014-07-05 20:00:04 +02:00
Bezleputh
ac161104ea [filter] manage basestring entry in CleanHTML filter 2014-07-05 14:24:37 +02:00
Bezleputh
96271b6de4 [filters] manage default in Dict filter 2014-07-05 14:24:14 +02:00
Laurent Bachelier
73cd8762f5 Allow for a default argument in MultiFilter 2014-07-05 14:23:29 +02:00
Laurent Bachelier
3f2d8ae185 Allow for a custom element finder
And end up with less duplicate code!
2014-07-05 14:23:27 +02:00
Romain Bignon
18c1f46922 ability to override the flush() method 2014-07-01 20:37:58 +02:00
Vincent Paredes
714a0e7617 matching content with url using is_here 2014-07-01 15:52:51 +02:00
Laurent Bachelier
b9c6176628 browser2: Allow setting query string params on build_url
The outcome is exactly the same as using requests with the "params"
parameter.
2014-06-20 17:58:51 +02:00
Laurent Bachelier
5dd0e9e0ec Small style fixes 2014-06-20 17:58:51 +02:00
Laurent Bachelier
b013828ad0 browser2: Add a filter to change the base element used for selectors 2014-06-17 00:48:30 +02:00
Laurent Bachelier
04cec70e1f browser2 filters: Force unicode, little style fixes
lxml for Python2 has the tendency to return str instead of
unicode when the contents are pure ASCII.
Try to fix the nonsense.
2014-06-04 00:58:35 +02:00
Laurent Bachelier
9619ddcaa2 browser2: Add RawText filter
Allows getting .text of elements without any alteration.
This is useful for at least textarea and pre tags.

Maybe the .join character should be configurable.
2014-06-04 00:49:00 +02:00
Laurent Bachelier
c69c5cf5ef browser2: More specialized exceptions
and extend common exceptions
2014-06-03 22:28:21 +02:00
Laurent Bachelier
e01fda826c filters: Properly handle defaults that are not datetimes 2014-05-27 17:41:51 +02:00
Romain Bignon
6d451e5f34 Date filter: use default value for empty input 2014-05-27 12:21:24 +02:00
Bezleputh
3c4f8d35e0 [Filters] convert date in english in DateTime 2014-05-21 17:54:55 +02:00
Romain Bignon
a5f95183a7 fix syntax error 2014-05-19 07:50:45 +02:00
Romain Bignon
c409675e6c fix compatibility with python2.6 2014-05-19 07:23:12 +02:00
Laurent Bachelier
14b1b56914 browser2: Add an option to convert POST data to the proper encoding
And autodetect it on forms. There is no other way to know what the
expected encoding is.
2014-05-19 01:01:25 +02:00
Romain Bignon
3e1dec519e move ParseError into weboob.tools.exceptions 2014-05-17 14:27:55 +02:00
Romain Bignon
6fcac89dd5 first step in python3 support 2014-05-17 14:27:55 +02:00
Romain Bignon
686a3b77e8 fix URL.id2url (give the browser instance to URL.build()) 2014-05-17 14:27:34 +02:00
Laurent Bachelier
82f47bff88 Allow forcing a Page content encoding 2014-05-16 15:37:24 +02:00
Laurent Bachelier
e01b39c8d2 Also ignore URLs where all kwargs were not used 2014-05-16 11:47:25 +02:00
Laurent Bachelier
6e9910ae9a Only use full-name substitutions, to allow % in URLs 2014-05-16 11:47:25 +02:00
Roger Philibert
6031ff1ef9 Form.submit can take extra parameters given to location() 2014-05-09 22:59:36 +02:00
Bezleputh
fadd88dafc [browser2] Add a Dict filter 2014-05-06 22:32:41 +02:00
Romain Bignon
ab710e0f74 support GET forms 2014-04-29 22:00:49 +02:00
Romain Bignon
61bc712068 Revert "Detect duplicate objects with id "0""
This reverts commit 6cae2cd0a5.
2014-04-26 12:07:22 +02:00
Florent
6cae2cd0a5 Detect duplicate objects with id "0" 2014-04-24 16:18:19 +02:00
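
The whitespace rules listed in the "CleanText improvements" commit (a0559e539e) can be illustrated with a minimal sketch. This is not the project's code: the function name clean_text and its newlines parameter are hypothetical, chosen only for illustration. It assumes Python 3, where \s already matches Unicode whitespace for str patterns; the re.UNICODE flag is kept explicit only to mirror the commit message, which targeted Python 2.

```python
import re


def clean_text(text, newlines=False):
    """Hypothetical helper illustrating the rules; not the real CleanText."""
    if newlines:
        # Collapse each run of non-newline whitespace (tabs, non-breaking
        # spaces, ...) into a single space. \t needs no special case
        # because it is already part of \s.
        text = re.sub(r'[^\S\n]+', ' ', text, flags=re.UNICODE)
        # Keep newlines but normalize them: a run of newlines plus any
        # surrounding whitespace becomes exactly one "\n".
        text = re.sub(r'\s*\n\s*', '\n', text, flags=re.UNICODE)
    else:
        # Default behaviour: every run of whitespace becomes one space.
        text = re.sub(r'\s+', ' ', text, flags=re.UNICODE)
    return text.strip()


# Tab, non-breaking space, trailing space, blank line, indentation.
sample = 'a\t\u00a0 b \n\n  c'
flat = clean_text(sample)
kept = clean_text(sample, newlines=True)
```

With the sample above, flat holds the text on a single line, while kept preserves one normalized line break between "a b" and "c".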