[doc/guides/module] use browser2

2015-06-17 23:45:52 +02:00 · 2015-06-17 23:45:52 +02:00 · 7f10865215
commit 7f10865215
parent 4881bb0be7
1 changed file with 39 additions and 32 deletions
--- a/docs/source/guides/module.rst
+++ b/docs/source/guides/module.rst
@@ -347,52 +347,59 @@ When your browser locates on a page, an instance of the class related to the
 is created. You can declare methods on your class to allow your browser to
 interact with it.
-The first thing to know is that your instance owns these attributes:
+The first thing to know is that page parsing is done in a descriptive way. You
-
+don't have to loop over HTML elements to construct the object. Just describe how
-* ``browser`` - your ``ExampleBrowser`` class
+to get the correct data to construct it. It is the Browser class's job to actually
-* ``logger`` - context logger
+construct the object.
 * ``encoding`` - the encoding of the page
 * ``response`` - the ``Response`` object from ``requests``
 * ``url`` - current url
 * ``doc`` - parsed document with ``lxml``
 The most important attribute is ``doc``, which you will use to get information from the page. You can call two methods on it:
 * ``xpath`` - xpath expressions
 * ``cssselect`` - CSS selectors
 For example::
    from weboob.browser.filters.html import Attr
    from weboob.browser.filters.standard import CleanDecimal, CleanText
    from weboob.capabilities.bank import Account
    class ListPage(LoggedPage, HTMLPage):
-        def get_accounts(self):
+        @method
-            for el in self.doc.xpath('//ul[@id="list"]/li'):
+        class get_accounts(ListElement):
-                account = Account()
+            item_xpath = '//ul[@id="list"]/li'
                account.id = el.attrib['id']
                account.label = el.xpath('./td[@class="name"]').text
                account.balance = Decimal(el.xpath('./td[@class="balance"]').text)
                yield account
-An alternative with ``cssselect``::
+            class item(ItemElement):
                klass = Account
-    from weboob.capabilities.bank import Account
+                obj_id = Attr('.', 'id')
                obj_label = CleanText('./td[@class="name"]')
                obj_balance = CleanDecimal('./td[@class="balance"]')
-    class ListPage(LoggedPage, HTMLPage):
+As you can see, we first set ``item_xpath``, which is the XPath string used to
-        def get_accounts(self):
+iterate over elements to access data. Next, we define ``klass``, which
-            for el in self.document.getroot().cssselect('ul#list li'):
+is the real class of our object. And then we describe how to fill each object's
-                id = el.attrib['id']
+attribute using what we call filters.
-                account = Account()
+
-                account.id = el.attrib['id']
+Some examples of filters:
-                account.label = el.cssselect('td.name').text
+
-                account.balance = Decimal(el.cssselect('td.balance').text)
+* :class:`Attr <weboob.browser.filters.html.Attr>`: extract a tag attribute
-                yield account
+* :class:`CleanText <weboob.browser.filters.standard.CleanText>`: get a cleaned text from an element
 * :class:`CleanDecimal <weboob.browser.filters.standard.CleanDecimal>`: get a cleaned Decimal value from an element
 * :class:`Date <weboob.browser.filters.standard.Date>`: read common date formats
 * :class:`Link <weboob.browser.filters.html.Link>`: get the link uri of an element
 * :class:`Regexp <weboob.browser.filters.standard.Regexp>`: apply a regex
 * :class:`Time <weboob.browser.filters.standard.Time>`: read common time formats
 * :class:`Type <weboob.browser.filters.standard.Type>`: get a cleaned value of any type from an element text
 Filters can be combined. For example::
    obj_id = Link('./a[1]') & Regexp(r'id=(\d+)') & Type(type=int)
 This code does several things, in order:
 #) extract the href attribute of our item's first ``a`` child tag
 #) apply a regex to extract a value
 #) convert this value to int type
 .. note::
   All object IDs must be unique, and usable to get more information later
 Your module is now functional and you can use this command::
    $ boobank -b example list
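
Reviewer note on the filter chain in the hunk above (``Link('./a[1]') & Regexp(r'id=(\d+)') & Type(type=int)``): the following is a minimal standalone sketch of how ``&`` chaining could behave. It is not weboob's implementation. The ``Filter`` and ``Chain`` classes, and the simplified ``Link``, ``Regexp`` and ``Type`` stand-ins, are hypothetical and written only for illustration, and the sample ``<li>`` markup is assumed. The sketch uses the standard library's ``xml.etree.ElementTree`` instead of ``lxml`` so that it runs without extra dependencies.

```python
import re
import xml.etree.ElementTree as ET


class Filter:
    """Hypothetical base class: a callable that can be chained with ``&``."""

    def __call__(self, value):
        raise NotImplementedError

    def __and__(self, other):
        # ``a & b`` builds a new filter that applies ``a`` first, then ``b``.
        return Chain(self, other)


class Chain(Filter):
    """Feeds the output of the first filter into the second one."""

    def __init__(self, first, second):
        self.first = first
        self.second = second

    def __call__(self, value):
        return self.second(self.first(value))


class Link(Filter):
    """Stand-in: return the ``href`` attribute of the element found at ``path``."""

    def __init__(self, path):
        self.path = path

    def __call__(self, element):
        return element.find(self.path).attrib['href']


class Regexp(Filter):
    """Stand-in: return the first captured group of ``pattern``."""

    def __init__(self, pattern):
        self.pattern = re.compile(pattern)

    def __call__(self, text):
        match = self.pattern.search(text)
        if match is None:
            raise ValueError('pattern %r not found in %r' % (self.pattern.pattern, text))
        return match.group(1)


class Type(Filter):
    """Stand-in: convert the incoming value with the given callable."""

    def __init__(self, type):
        self.type = type

    def __call__(self, value):
        return self.type(value)


# Assumed sample markup for one list item.
item = ET.fromstring('<li id="acc"><a href="/account?id=42">Checking</a></li>')

# Same shape as the expression in the guide: link, then regex, then int.
get_id = Link('./a[1]') & Regexp(r'id=(\d+)') & Type(type=int)
account_id = get_id(item)
```

Because ``&`` is left-associative, the expression builds ``Chain(Chain(Link, Regexp), Type)``, which matches the order of the three steps listed in the guide.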