Blackout filtering
==================

Filtering introduction
======================

The Sitemap portal type has an option for filtering out objects
that should be excluded from a sitemap. This option is available
in the sitemap edit form and is labeled "Blackout entries".

In earlier versions of the package (<4.0.1 for the plone-4 branch
and <3.0.7 for the plone-3 branch) this field could filter objects
only by their ids, and looked like this:

<pre>
index.html
index_html
</pre>

So all objects with the id "index.html" or "index_html" were
excluded from the sitemap.

In newer versions of GoogleSitemaps, filtering was reworked into
a pluggable architecture. Filters are now named multi-adapters.
There are only two default filters: "id" and "path".

Because different filters can be used, a new syntax was introduced
for the "Blackout entries" field. Every record in the field
should follow this specification:

    [<filter name>:]<filter arguments>

If no <filter name> is specified, the "id" filter is used. If
<filter name> is specified, the system looks up a multi-adapter
named <filter name> for the IBlackoutFilter interface. If no such
multi-adapter is found, the filter is ignored without raising
any errors.

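The record syntax and the "ignore unknown filters" rule can be tried
outside Plone with a minimal sketch. It is written for a current
Python 3 interpreter and is not the package's implementation: the
names parse_blackout_entry, apply_blackout and FILTERS are
hypothetical, a plain dict stands in for the named multi-adapter
lookup, and splitting on the first ":" is an assumption.

```python
def parse_blackout_entry(line, default_filter="id"):
    """Split one "Blackout entries" record into (filter name, arguments)."""
    name, sep, args = line.strip().partition(":")
    if not sep:
        # No "<filter name>:" prefix, so the whole record is the
        # argument of the default filter.
        return default_filter, name
    return name, args


def apply_blackout(entries, items, filters):
    """Run each record through its named filter, skipping unknown names."""
    for line in entries:
        if not line.strip():
            continue  # blank lines in the field carry no record
        name, args = parse_blackout_entry(line)
        filter_func = filters.get(name)
        if filter_func is None:
            continue  # unknown filter name: ignored, no error raised
        items = filter_func(items, args)
    return items


# Hypothetical registry standing in for the named adapter lookup.
FILTERS = {
    "id": lambda items, fid: [i for i in items if i.rsplit("/", 1)[-1] != fid],
}

paths = ["/doc1", "/doc2", "/Members/test_user_1_/doc1"]
print(apply_blackout(["doc1", "nosuchfilter:whatever"], paths, FILTERS))
```

The unprefixed "doc1" record is routed to the "id" filter, while the
record naming an unregistered filter is skipped silently.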

The following sections demonstrate how filtering works,
including the behavior of the default filters ("id" and "path").

Setup demonstration environment
===============================

First, we must perform some setup. We use the testbrowser that is
shipped with Five, as this provides proper Zope 2 integration. Most
of the documentation, though, is in the underlying zope.testbrowser
package.

    >>> from Products.Five.testbrowser import Browser
    >>> browser = Browser()
    >>> portal_url = self.portal.absolute_url()

The following is useful when writing and debugging testbrowser tests.
It lets us see all error messages in the error_log.

    >>> self.portal.error_log._ignored_exceptions = ()

With that in place, we can go to the portal front page and log in.
We will do this using the default user from PloneTestCase:

    >>> from Products.PloneTestCase.setup import portal_owner, default_password
    >>> browser.open(portal_url)

We have the login portlet, so let's use that.

    >>> browser.open('http://nohost/plone/login_form')
    >>> browser.getLink('Log in').click()
    >>> browser.url
    'http://nohost/plone/login_form'
    >>> browser.getControl('Login Name').value = portal_owner
    >>> browser.getControl('Password').value = default_password
    >>> browser.getControl('Log in').click()
    >>> "You are now logged in" in browser.contents
    True
    >>> "Login failed" in browser.contents
    False
    >>> browser.url
    'http://nohost/plone/login_form'


Functionality
=============

First, create some content for the demonstrations.

In the root of the portal:

    >>> self.addDocument(self.portal, "doc1", "Document 1 text")
    >>> self.addDocument(self.portal, "doc2", "Document 2 text")

And in the member's folder:

    >>> self.addDocument(self.folder, "doc1", "Member Document 1 text")
    >>> self.addDocument(self.folder, "doc2", "Member Document 2 text")

We need to add a sitemap for the demonstration.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> browser.getControl('Add Content Sitemap').click()

This brings us to the edit form of the newly created content sitemap.
We are interested in two things: the "Blackout entries" field must be
present in the form, and it should be empty by default.

    >>> # Debugging aid: dump the page to a file for inspection.
    >>> file("/tmp/browser.0.html","wb").write(browser.contents)
    >>> blackout_list = browser.getControl("Blackout entries")
    >>> blackout_list
    <Control name='blackout_list:lines' type='textarea'>
    >>> blackout_list.value == ""
    True
    >>> save_button = browser.getControl("Save")
    >>> save_button
    <SubmitControl name='form...' type='submit'>
    >>> save_button.click()


Clicking the "Save" button leads us to the resulting sitemap view.

    >>> print browser.contents
    <?xml version="1.0" encoding=...


A "sitemap.xml" link should appear on the "Settings" page of the
Plone Google Sitemap configlet once the "Content Sitemap" has been
added.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> smedit_link = browser.getLink('sitemap.xml')
    >>> smedit_url = smedit_link.url

This link points to the edit form of the newly created sitemap.xml.
Let's prepare the view link to simplify the following demonstrations.

    >>> smedit_url.endswith("sitemap.xml/edit")
    True
    >>> smview_url = smedit_url[:-5]

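The slice [:-5] works because "/edit" is exactly five characters
long. A slightly more explicit variant, shown here as a standalone
Python 3 sketch with a hypothetical helper name, derives the length
from the suffix and refuses URLs that do not end with it:

```python
def edit_url_to_view_url(edit_url, suffix="/edit"):
    """Return edit_url without its trailing suffix, or raise if it is absent."""
    if not edit_url.endswith(suffix):
        raise ValueError("not an edit URL: %r" % edit_url)
    # Slice by the suffix length instead of a hard-coded 5.
    return edit_url[:-len(suffix)]


view_url = edit_url_to_view_url("http://nohost/plone/sitemap.xml/edit")
print(view_url)
```

This prints the sitemap view URL, http://nohost/plone/sitemap.xml.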

No filters
==========

The newly created sitemap has no filters, so all documents should
appear in it.

    >>> browser.open(smview_url)
    >>> # Debugging aid: dump the page to a file for inspection.
    >>> file("/tmp/browser.1.html","wb").write(browser.contents)
    >>> no_filters_content = browser.contents

Check that the resulting page really is a sitemap:

    >>> print browser.contents
    <?xml version="1.0" encoding=...


Create a regular expression that helps us test which URLs pass the
filters.

    >>> import re
    >>> reloc = re.compile("<loc>%s([^<]*)</loc>" % self.portal.absolute_url(), re.S)

Test that all 4 documents and the default front page are present in
the sitemap when no filters are set.

    >>> no_filters_res = reloc.findall(no_filters_content)
    >>> no_filters_res.sort()
    >>> print "\n".join(no_filters_res)
    /Members/test_user_1_/doc1
    /Members/test_user_1_/doc2
    /doc1
    /doc2
    /front-page

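The extraction step can be reproduced without a running portal. The
following Python 3 sketch applies the same kind of pattern to a
hand-written sitemap fragment; the XML sample is made up for
illustration, and re.escape is an addition that keeps characters in
the portal URL from being read as regular expression syntax.

```python
import re

portal_url = "http://nohost/plone"

# Hand-written sample; a real sitemap carries more elements per <url>.
sitemap_xml = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://nohost/plone/front-page</loc></url>
  <url><loc>http://nohost/plone/doc1</loc></url>
  <url><loc>http://nohost/plone/Members/test_user_1_/doc1</loc></url>
</urlset>
"""

# Capture everything between the portal URL and the closing </loc> tag.
reloc = re.compile(r"<loc>%s([^<]*)</loc>" % re.escape(portal_url))
found = sorted(reloc.findall(sitemap_xml))
print("\n".join(found))
```

Each captured group is a portal-relative path, which is why the
expected results in this document start with "/".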

Check "id" filter
=================

Go to the edit form of the sitemap and add "doc1"
and "front-page" lines with the "id:" prefix to the
"Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ... id:doc1
    ... id:front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> id_filter_content = browser.contents

The "doc1" and "front-page" documents should be excluded from the
sitemap.

    >>> id_filter_res = reloc.findall(id_filter_content)
    >>> id_filter_res.sort()
    >>> print "\n".join(id_filter_res)
    /Members/test_user_1_/doc2
    /doc2

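The effect of the "id" filter can be illustrated with plain Python 3
data. This is a sketch of the observable behavior only, not the
package's code: Item and filter_out_by_id are hypothetical names, and
a namedtuple stands in for whatever objects the real filter receives.

```python
from collections import namedtuple

# Hypothetical stand-in for a sitemap candidate: an id plus its
# portal-relative path.
Item = namedtuple("Item", ["id", "path"])


def filter_out_by_id(items, blocked_id):
    """Drop every item whose id equals blocked_id, wherever it is located."""
    return [item for item in items if item.id != blocked_id]


items = [Item(path.rsplit("/", 1)[-1], path) for path in (
    "/Members/test_user_1_/doc1",
    "/Members/test_user_1_/doc2",
    "/doc1",
    "/doc2",
    "/front-page",
)]

for blocked in ("doc1", "front-page"):
    items = filter_out_by_id(items, blocked)

remaining = [item.path for item in items]
print(remaining)
```

Because the match is on the id alone, both "doc1" documents
disappear, regardless of the folder they live in.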

Check "path" filter
===================

Suppose we want to exclude the "front-page" object in the portal root
and the "doc2" document located in the test_user_1_ home folder,
but leave "doc2" in the portal root and all other objects untouched.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ... path:/Members/test_user_1_/doc2
    ... path:/front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> path_filter_content = browser.contents

The "/Members/test_user_1_/doc2" and "/front-page" objects should
be excluded from the sitemap.

    >>> path_filter_res = reloc.findall(path_filter_content)
    >>> path_filter_res.sort()
    >>> print "\n".join(path_filter_res)
    /Members/test_user_1_/doc1
    /doc1
    /doc2

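The difference from the "id" filter is that a path identifies a
single location. The Python 3 sketch below models that with an exact
comparison of portal-relative paths. Exact matching is an assumption
made for this illustration; the example above does not show whether
the real "path" filter also matches objects below a given path.
filter_out_by_path is a hypothetical name.

```python
def filter_out_by_path(paths, blocked_path):
    """Drop the entry whose portal-relative path equals blocked_path."""
    return [path for path in paths if path != blocked_path]


paths = [
    "/Members/test_user_1_/doc1",
    "/Members/test_user_1_/doc2",
    "/doc1",
    "/doc2",
    "/front-page",
]

for blocked in ("/Members/test_user_1_/doc2", "/front-page"):
    paths = filter_out_by_path(paths, blocked)

print(paths)
```

Here "/doc2" in the portal root survives even though it shares its id
with the excluded member document.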

Check default filter
====================

This raises a question: which filter is used when no filter name
prefix is specified (as in old-style entries, for example)?

Go to the edit form of the sitemap and add "doc1" and
"front-page" lines without any filter name prefix to the
"Blackout entries" field.

    >>> browser.open(portal_url + "/sitemap.xml/edit")
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ... doc1
    ... front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> default_filter_content = browser.contents

The "id" filter must be used as the default filter, so all "doc1" and
"front-page" objects should be excluded from the sitemap.

    >>> default_filter_res = reloc.findall(default_filter_content)
    >>> default_filter_res.sort()
    >>> print "\n".join(default_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Creating your own filters
=========================

Suppose we want to create our own blackout filter, which behaves
like the id filter but with some differences.
Our filter has the following format:

    (+|-)<filtered id>

 - when the first character is "+", only objects with <filtered id>
   should be left in the sitemap after filtering;
 - when the first character is "-", all objects with <filtered id>
   should be excluded from the sitemap (like the default id
   filter).

To do this, create a new IBlackoutFilter multi-adapter
and register it under a unique name.

    >>> from zope.component import adapts
    >>> from zope.interface import Interface, implements
    >>> from zope.publisher.interfaces.browser import IBrowserRequest
    >>> from quintagroup.plonegooglesitemaps.interfaces import IBlackoutFilter
    >>> class SignedIdFilter(object):
    ...     adapts(Interface, IBrowserRequest)
    ...     implements(IBlackoutFilter)
    ...     def __init__(self, context, request):
    ...         self.context = context
    ...         self.request = request
    ...     def filterOut(self, fdata, fargs):
    ...         sign = fargs[0]
    ...         fid = fargs[1:]
    ...         if sign == "+":
    ...             return [b for b in fdata if b.getId==fid]
    ...         elif sign == "-":
    ...             return [b for b in fdata if b.getId!=fid]
    ...         return fdata

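The filtering logic of SignedIdFilter can be exercised on its own,
without Zope, using a Python 3 sketch. Brain and signed_id_filter are
hypothetical stand-ins: the namedtuple only mimics an object whose
getId attribute holds the id, and slicing with fargs[:1] (instead of
fargs[0]) is a small deviation that tolerates empty arguments.

```python
from collections import namedtuple

# Hypothetical stand-in for the items passed as fdata; getId is a
# plain attribute here, matching how the filter above compares it.
Brain = namedtuple("Brain", ["getId", "path"])


def signed_id_filter(fdata, fargs):
    """Keep ("+") or drop ("-") the items whose id follows the sign."""
    sign, fid = fargs[:1], fargs[1:]
    if sign == "+":
        return [b for b in fdata if b.getId == fid]
    if sign == "-":
        return [b for b in fdata if b.getId != fid]
    return fdata  # no recognized sign: leave the data untouched


brains = [Brain(path.rsplit("/", 1)[-1], path) for path in (
    "/Members/test_user_1_/doc1",
    "/Members/test_user_1_/doc2",
    "/doc1",
    "/doc2",
    "/front-page",
)]

kept = [b.path for b in signed_id_filter(brains, "+doc1")]
dropped = [b.path for b in signed_id_filter(brains, "-doc1")]
print(kept)
print(dropped)
```

With "+doc1" only the two "doc1" paths remain; with "-doc1" they are
the only ones removed.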

Now register this new filter as a named multi-adapter:

    >>> from zope.component import provideAdapter
    >>> provideAdapter(SignedIdFilter,
    ...                name=u'signedid')

That is all that is needed to add a new filter.
Now test the newly created filter.

Check whether whitelist filtering ("+" prefix) works correctly.
Go to the edit form of the sitemap and add "signedid:+doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ... signedid:+doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

Only objects with the id "doc1" should be left in the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc1
    /doc1


Finally, check whether blacklist filtering ("-" prefix)
works correctly.
Go to the edit form of the sitemap and add "signedid:-doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ... signedid:-doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

All objects except those with the id "doc1" must be included in
the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc2
    /doc2
    /front-page