weboob-devel/modules/canalplus/browser.py
Laurent Bachelier b4b7182960 Make Collection more safe and sane
* Remove callbacks in Collection object
  Make Collection a "dumb" object (and also a base object,
  though it isn't very useful for now)
* Rename Path to WorkingPath, because it is more about managing state
  than being a single path.
* Rewrite almost all WorkingPath, because the code was overly
  complicated for no reason (I tried some special cases and it turned
  out that fromstring didn't handle them, and that the
  quote-escape-unquote was just unnecessary). I also rewrote it to be
  more pythonic (no more lambdas and maps) and added tests.
* Require the full split path when creating a Collection. Because, come to
  think of it, an object needs a unique identifier; in the case of
  Collections, it is the full path, not only its last part.
  I might even replace the id by the full split path in the future.
* There is now only one way to get items of a Collection: calling
  iter_resources().
* Rewrite flatten_resources to iter_resources_flat(), which just calls
  iter_resources() recursively.
* Rewrite the collection part of the canalplus module. There are no more
  callbacks, and no page calls the browser to check another page!
  The logic is only in iter_resources().
  The resulting code is not very pretty, but it should get better.
  As a bonus, avoid reloading the main XML file when we already have it
  open.
* change_path() now expects a split path and not a string.
* up/home special cases for "cd" are handled in the same place, and
  store the previous place properly (but are not yet exploitable by
  a user command).
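
As a rough illustration of the iter_resources() / iter_resources_flat()
relationship described above, here is a minimal sketch. The Collection
stand-in, the FakeBackend class and its TREE data are hypothetical and
only exist for this example; they are not the weboob classes.

```python
class Collection(object):
    """Dumb container identified by its full split path (hypothetical stand-in)."""

    def __init__(self, split_path, title=None):
        self.split_path = list(split_path)
        self.title = title


class FakeBackend(object):
    """Toy backend: TREE maps a split path (as a tuple) to its direct children."""

    TREE = {
        (): [Collection(['news']), Collection(['sport'])],
        ('news',): [Collection(['news', 'daily']), 'video-1'],
        ('news', 'daily'): ['video-2', 'video-3'],
        ('sport',): ['video-4'],
    }

    def iter_resources(self, split_path):
        # The single way to list the items of a collection.
        for item in self.TREE.get(tuple(split_path), []):
            yield item

    def iter_resources_flat(self, split_path):
        # Descend into sub-collections, yield every other item unchanged.
        for item in self.iter_resources(split_path):
            if isinstance(item, Collection):
                for sub in self.iter_resources_flat(item.split_path):
                    yield sub
            else:
                yield item


flat = list(FakeBackend().iter_resources_flat([]))
print(flat)
```

Because each Collection carries its full split path, the recursive call
needs no extra state: it simply passes item.split_path back in.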

This is a big commit but it would be hard to split it into *working*
commits.

If you read this entire commit message, I will buy you a beer.

refs #774
fixes #773
2012-03-13 22:08:45 +01:00

# -*- coding: utf-8 -*-
# Copyright(C) 2010-2011 Nicolas Duhamel
#
# This file is part of weboob.
#
# weboob is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# weboob is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with weboob. If not, see <http://www.gnu.org/licenses/>.

import urllib

import lxml.etree

from weboob.tools.browser import BaseBrowser
from weboob.tools.browser.decorators import id2url

from .pages import InitPage, VideoPage
from .video import CanalplusVideo

from weboob.capabilities.collection import CollectionNotFound


__all__ = ['CanalplusBrowser']


class XMLParser(object):
    def parse(self, data, encoding=None):
        if encoding is None:
            parser = None
        else:
            parser = lxml.etree.XMLParser(encoding=encoding, strip_cdata=False)
        return lxml.etree.XML(data.get_data(), parser)


class CanalplusBrowser(BaseBrowser):
    DOMAIN = u'service.canal-plus.com'
    ENCODING = 'utf-8'
    PAGES = {
        r'http://service.canal-plus.com/video/rest/initPlayer/cplus/': InitPage,
        r'http://service.canal-plus.com/video/rest/search/cplus/.*': VideoPage,
        r'http://service.canal-plus.com/video/rest/getVideosLiees/cplus/(?P<id>.+)': VideoPage,
        r'http://service.canal-plus.com/video/rest/getMEAs/cplus/.*': VideoPage,
    }

    # We need lxml.etree.XMLParser to read CDATA
    PARSER = XMLParser()
    FORMATS = {
        'sd': 'BAS_DEBIT',
        'hd': 'HD',
    }

    def __init__(self, quality, *args, **kwargs):
        BaseBrowser.__init__(self, parser=self.PARSER, *args, **kwargs)
        # Fall back to HD when the requested quality is unknown
        self.quality = self.FORMATS.get(quality, self.FORMATS['hd'])

    def home(self):
        self.location('http://service.canal-plus.com/video/rest/initPlayer/cplus/')

    def search_videos(self, pattern):
        # Slashes are dropped from the pattern before it is URL-quoted
        self.location('http://service.canal-plus.com/video/rest/search/cplus/' + urllib.quote_plus(pattern.replace('/', '').encode('utf-8')))
        return self.page.iter_results()

    @id2url(CanalplusVideo.id2url)
    def get_video(self, url, video=None):
        self.location(url)
        return self.page.get_video(video, self.quality)

    def iter_resources(self, split_path):
        # Only reload the main XML file if it is not already open
        if not self.is_on_page(InitPage):
            self.home()
        channels = self.page.get_channels()

        if len(split_path) == 0:
            # Root: top-level channels
            for channel in channels:
                if len(channel.split_path) == 1:
                    yield channel
        elif len(split_path) == 1:
            # One level down: subchannels of the given channel
            for channel in channels:
                if len(channel.split_path) == 2 and split_path[0] == channel.split_path[0]:
                    yield channel
        elif len(split_path) == 2:
            # Two levels down: videos of the matching subchannel
            subchannels = self.iter_resources(split_path[0:1])
            channel = None
            for subchannel in subchannels:
                # allow matching by title for backward compatibility (for now)
                if split_path[0] == subchannel.split_path[0] and \
                   split_path[1] in (subchannel.split_path[1], subchannel.title):
                    channel = subchannel
            if channel:
                self.location("http://service.canal-plus.com/video/rest/getMEAs/cplus/%s" % channel.id)
                assert self.is_on_page(VideoPage)
                for video in self.page.iter_channel():
                    yield video
            else:
                raise CollectionNotFound(split_path)
        else:
            raise CollectionNotFound(split_path)
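
The path handling in iter_resources() can be tried without any network
access. The following is a minimal sketch, not the module's code: Channel,
CHANNELS, list_children() and find_subchannel() are hypothetical names, and
CollectionNotFound is a local stand-in for the weboob exception. One
difference is deliberate: the sketch returns the first matching subchannel,
whereas the loop above keeps the last one.

```python
class CollectionNotFound(Exception):
    """Local stand-in for weboob.capabilities.collection.CollectionNotFound."""


class Channel(object):
    """Hypothetical channel record: an id, its full split path and a title."""

    def __init__(self, id, split_path, title):
        self.id = id
        self.split_path = split_path
        self.title = title


# Made-up sample data, flat like the list returned by get_channels().
CHANNELS = [
    Channel('1', ['news'], 'News'),
    Channel('10', ['news', 'daily'], 'Daily report'),
    Channel('2', ['sport'], 'Sport'),
    Channel('20', ['sport', 'football'], 'Football'),
]


def list_children(channels, split_path):
    """Return the channels directly below split_path (depth 0 or 1 only)."""
    depth = len(split_path)
    if depth > 1:
        raise CollectionNotFound(split_path)
    return [c for c in channels
            if len(c.split_path) == depth + 1
            and c.split_path[:depth] == split_path]


def find_subchannel(channels, split_path):
    """Resolve a two-part path; the second part may also be a title."""
    for sub in list_children(channels, split_path[0:1]):
        if split_path[1] in (sub.split_path[1], sub.title):
            return sub
    raise CollectionNotFound(split_path)


print([c.id for c in list_children(CHANNELS, [])])
print(find_subchannel(CHANNELS, ['news', 'Daily report']).id)
```

Keeping the full split path on every channel is what lets a single flat
list serve all depths: the depth of an entry is just the length of its path.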