Error Continuous Replications 5.18.0



  • Hi there, I'm testing XO (xo-server and xo-web) 5.18.0.

    Tested on XenServer 7.1 and on XCP-ng 7.4.

    With some machines I get this error.

    In this case I have 2 machines cloned from the same template.

    By the way, the machines that are replicated correctly have a wrong time value: it's 2 hours behind the current time (minor issue).

     [{"message":"VDI_IO_ERROR(Device I/O errors)","stack":"XapiError: VDI_IO_ERROR(Device I/O errors)\n at wrapError (/opt/xen-orchestra/packages/xen-api/src/index.js:111:9)\n at getTaskResult (/opt/xen-orchestra/packages/xen-api/src/index.js:189:22)\n at Xapi._addObject (/opt/xen-orchestra/packages/xen-api/src/index.js:796:8)\n at /opt/xen-orchestra/packages/xen-api/src/index.js:832:13\n at arrayEach (/opt/xen-orchestra/node_modules/lodash/_arrayEach.js:15:9)\n at forEach (/opt/xen-orchestra/node_modules/lodash/forEach.js:38:10)\n at Xapi._processEvents (/opt/xen-orchestra/packages/xen-api/src/index.js:827:12)\n at onSuccess (/opt/xen-orchestra/packages/xen-api/src/index.js:850:11)\n at run (/opt/xen-orchestra/node_modules/core-js/modules/es6.promise.js:66:22)\n at /opt/xen-orchestra/node_modules/core-js/modules/es6.promise.js:79:30\n at flush (/opt/xen-orchestra/node_modules/core-js/modules/_microtask.js:18:9)\n at process._tickCallback (internal/process/next_tick.js:112:11)","code":"VDI_IO_ERROR","params":["Device I/O errors"],"url":"https://192.168.222.230/import_raw_vdi/?format=vhd&vdi=OpaqueRef%3A033ee0fa-8524-4721-bd85-e2ea6f70fb0f&session_id=OpaqueRef%3Afbdb6310-4250-4689-b642-2cd270b42bad&task_id=OpaqueRef%3A56ab4936-f673-4351-a7b3-a7d9a37c31be"}]
    


  • Are you on XOA or did you install from the sources?



  • Hi, it's XO 5.18.
    We're also talking about the same problem in this post: https://xen-orchestra.com/forum/topic/633/job-canceled-to-protect-the-vdi-chain/5

    If you prefer to delete this post and continue on the other one, it's fine by me.



  • No, keep it here and close the other one. The other one is not in the right section 🙂



  • OK. Thanks.

    I've been doing more tests.

    • Development pool: XCP-ng 7.4.
    • I created a new XO (from sources) in the development pool.
    • Created a new VM and set up a Continuous Replication job within the same SR.

    I got the "interrupted" error; it stopped for no apparent reason.

    I repeated the operation, and this time it worked. :?

    • Production pool: XenServer 7.1. Fast SR, same as above.
      CR with 1 VM from the storage to the same storage (HBA, SAS disks).
      First try: error.
      Retried: error.
    backup-ng:
    VM: x-debian9-testXO-SYN (Dell-pool)
    tag: CR-debian9-testXO-SYN
    Start: Apr 6, 2018, 1:10:36 PM
    [{"message":"VDI_IO_ERROR(Device I/O errors)","stack":"XapiError: VDI_IO_ERROR(Device I/O errors)\n at wrapError (/opt/xen-orchestra/packages/xen-api/src/index.js:111:9)\n at getTaskResult (/opt/xen-orchestra/packages/xen-api/src/index.js:189:22)\n at Xapi._addObject (/opt/xen-orchestra/packages/xen-api/src/index.js:796:8)\n at /opt/xen-orchestra/packages/xen-api/src/index.js:832:13\n at arrayEach (/opt/xen-orchestra/node_modules/lodash/_arrayEach.js:15:9)\n at forEach (/opt/xen-orchestra/node_modules/lodash/forEach.js:38:10)\n at Xapi._processEvents (/opt/xen-orchestra/packages/xen-api/src/index.js:827:12)\n at onSuccess (/opt/xen-orchestra/packages/xen-api/src/index.js:850:11)\n at run (/opt/xen-orchestra/node_modules/core-js/modules/es6.promise.js:66:22)\n at /opt/xen-orchestra/node_modules/core-js/modules/es6.promise.js:79:30\n at flush (/opt/xen-orchestra/node_modules/core-js/modules/_microtask.js:18:9)\n at process._tickCallback (internal/process/next_tick.js:112:11)","code":"VDI_IO_ERROR","params":["Device I/O errors"],"url":"https://192.168.222.13/import_raw_vdi/?format=vhd&vdi=OpaqueRef%3Ab152af18-19ba-90ef-60d2-8f417045427b&session_id=OpaqueRef%3Af156389c-f9c0-425c-b870-5b7060febd12&task_id=OpaqueRef%3A10f3e3f3-4375-9fbd-ce76-2cec6dcfa3a0"}]
    

    On the production pool, slow SR.
    Same as above.
    CR with 1 VM from the storage to the same storage (NFS, SATA disks).
    First try: error.
    Retried: error.

        backup-ng:
        VM: x-debian9-testXO-SYN (Dell-pool)
        tag: CR-debian9-testXO-SYN
        Start: Apr 6, 2018, 1:13:02 PM
        [{"message":"VDI_IO_ERROR(Device I/O errors)","stack":"XapiError: VDI_IO_ERROR(Device I/O errors)\n at wrapError (/opt/xen-orchestra/packages/xen-api/src/index.js:111:9)\n at getTaskResult (/opt/xen-orchestra/packages/xen-api/src/index.js:189:22)\n at Xapi._addObject (/opt/xen-orchestra/packages/xen-api/src/index.js:796:8)\n at /opt/xen-orchestra/packages/xen-api/src/index.js:832:13\n at arrayEach (/opt/xen-orchestra/node_modules/lodash/_arrayEach.js:15:9)\n at forEach (/opt/xen-orchestra/node_modules/lodash/forEach.js:38:10)\n at Xapi._processEvents (/opt/xen-orchestra/packages/xen-api/src/index.js:827:12)\n at onSuccess (/opt/xen-orchestra/packages/xen-api/src/index.js:850:11)\n at run (/opt/xen-orchestra/node_modules/core-js/modules/es6.promise.js:66:22)\n at /opt/xen-orchestra/node_modules/core-js/modules/es6.promise.js:79:30\n at flush (/opt/xen-orchestra/node_modules/core-js/modules/_microtask.js:18:9)\n at process._tickCallback (internal/process/next_tick.js:112:11)","code":"VDI_IO_ERROR","params":["Device I/O errors"],"url":"https://192.168.222.13/import_raw_vdi/?format=vhd&vdi=OpaqueRef%3A8aab36cf-beab-2615-bc56-89696815536c&session_id=OpaqueRef%3A6d7ae4ce-0524-b85e-1fe4-16f40ff30e34&task_id=OpaqueRef%3A30d973ed-84ff-05a3-127b-1bb95f904047"}]
    

    By the way, I don't know if it's the same problem, but yesterday I tried to migrate machines from one pool to another and it was also unsuccessful.
    It was successful if I first stopped the VM, copied it, and started it in the other pool.



  • Are you using backup-ng or legacy backup?

    edit: can you try with legacy backup?



  • @olivierlambert said in Error Continuous Replications 5.18.0:

    legacy backup

    Hi, I was performing another test while you answered.

    I tried another CR on the development pool (XCP-ng 7.4), from the "slow" NFS SR to the same SR.
    First try: error.
    Second try: successful.

    At this moment I'm guessing that something is not right with XenServer 7.1.

    ** Now I'm going to try legacy, but I already checked the normal (non-incremental) backup with backup-ng, and that kind of backup works fine.



  • When you said "error" on the first try, would you mind telling us which error?

    Also, to be sure it's not due to the new code of backup-ng, I'd like you to test a CR job created in the backup section, NOT backup-ng.



  • Hi!
    First try on the production pool (XS 7.1), for example this:

    [{"message":"VDI_IO_ERROR(Device I/O errors)","stack":"XapiError: VDI_IO_ERROR(Device I/O errors)\n at wrapError (/opt/xen-orchestra/packages/xen-api/src/index.js:111:9)\n at getTaskResult (/opt/xen-orchestra/packages/xen-api/src/index.js:189:22)\n at Xapi._addObject (/opt/xen-orchestra/packages/xen-api/src/index.js:796:8)\n at /opt/xen-orchestra/packages/xen-api/src/index.js:832:13\n at arrayEach (/opt/xen-orchestra/node_modules/lodash/_arrayEach.js:15:9)\n at forEach (/opt/xen-orchestra/node_modules/lodash/forEach.js:38:10)\n at Xapi._processEvents (/opt/xen-orchestra/packages/xen-api/src/index.js:827:12)\n at onSuccess (/opt/xen-orchestra/packages/xen-api/src/index.js:850:11)\n at run (/opt/xen-orchestra/node_modules/core-js/modules/es6.promise.js:66:22)\n at /opt/xen-orchestra/node_modules/core-js/modules/es6.promise.js:79:30\n at flush (/opt/xen-orchestra/node_modules/core-js/modules/_microtask.js:18:9)\n at process._tickCallback (internal/process/next_tick.js:112:11)","code":"VDI_IO_ERROR","params":["Device I/O errors"],"url":"https://192.168.222.202/import_raw_vdi/?format=vhd&vdi=OpaqueRef%3A975806dd-0512-48d3-a40b-ac1e69b25778&session_id=OpaqueRef%3A1703e4db-8679-4ec1-a203-afda1b7128c0&task_id=OpaqueRef%3Ad03563a8-1332-4fcc-b036-95e2d1cf5a85"}]

    First try on the development pool:
    the error is "interrupted".

    Now I'm trying backup-legacy; news in a few minutes 🙂



  • Well, I tried again with backup-legacy.

    None of them worked.

    Production pool (inside the same SR):
    at first I received a timeout error; I changed the value to 120 seconds and now the error is "VDI_IO_ERROR(Device I/O errors)", however many times I retried.

    Development pool (inside the same SR):
    I'm receiving an "interrupted" error every time I try.



  • Something that fails almost every time would already have been detected if it were reproducible 😕

    I suspect a XenServer issue, please check your XS logs.



  • But backup-legacy also fails with XCP-ng 7.4.

    Where can I see these logs?



  • XenServer and XCP-ng are 99.99% similar.

    edit: in /var/log/xensource.log and /var/log/SMlog



  • @olivierlambert said in Error Continuous Replications 5.18.0:

    /var/log/xensource.log

    Thanks.

    Do you want me to copy & paste it here?



  • I'm pretty busy today, and the log could be really big 😕 Try to pinpoint events whose timestamps match the errors you saw in XO.
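    A minimal sketch of that kind of pinpointing, run here against a small fake sample (the real file is /var/log/xensource.log on the host; the timestamp format matches the excerpts quoted in this thread):

    ```shell
    # Hedged sketch: narrow the xapi log to the minute the backup failed,
    # then keep only error-level lines. The sample imitates the format of
    # /var/log/xensource.log; on a real host, grep the log file itself.
    cat > /tmp/xensource.sample <<'EOF'
    Apr  6 15:26:34 XCP1 xapi: [debug|XCP1|6 ha_monitor|HA monitor|xapi_ha] Processing warnings
    Apr  6 15:26:42 XCP1 xapi: [error|XCP1|4262 INET :::80|backtrace] Raised Server_error(HOST_IS_SLAVE, [ 192.168.222.230 ])
    Apr  6 15:30:01 XCP1 xapi: [debug|XCP1|6 ha_monitor|HA monitor|xapi_ha] Done with warnings
    EOF

    # Keep only the interesting minute, then only error lines:
    grep '^Apr  6 15:26' /tmp/xensource.sample | grep -F '[error'
    ```

    The same two-grep pattern works on /var/log/SMlog; widen the timestamp prefix (e.g. `'^Apr  6 15:2'`) to cover a larger window.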



  • Yes, you're right.
    Well, I think I caught something.

    This is a development server during a backup-legacy run, when the backup stopped at 42% more or less:

    Apr  6 15:26:34 XCP1 xapi: [debug|XCP1|6 ha_monitor|HA monitor D:bc6ea1becaa8|xapi_ha] Processing warnings
    Apr  6 15:26:34 XCP1 xapi: [debug|XCP1|6 ha_monitor|HA monitor D:bc6ea1becaa8|xapi_ha] Done with warnings
    Apr  6 15:26:34 XCP1 xapi: [debug|XCP1|6 ha_monitor|HA monitor D:bc6ea1becaa8|xapi_ha] The node we think is the master is still alive and marked as master; this is OK
    Apr  6 15:26:42 XCP1 xapi: [error|XCP1|4262 INET :::80|dispatch:session.login_with_password D:c8f25c93214a|backtrace] session.login_with_password D:1f68466fceda failed with exception Server_error(HOST_IS_SLAVE, [ 192.168.222.230 ])
    Apr  6 15:26:42 XCP1 xapi: [error|XCP1|4262 INET :::80|dispatch:session.login_with_password D:c8f25c93214a|backtrace] Raised Server_error(HOST_IS_SLAVE, [ 192.168.222.230 ])
    Apr  6 15:26:42 XCP1 xapi: [error|XCP1|4262 INET :::80|dispatch:session.login_with_password D:c8f25c93214a|backtrace] 1/8 xapi @ XCP1 Raised at file ocaml/xapi/xapi_session.ml, line 383
    Apr  6 15:26:42 XCP1 xapi: [error|XCP1|4262 INET :::80|dispatch:session.login_with_password D:c8f25c93214a|backtrace] 2/8 xapi @ XCP1 Called from file ocaml/xapi/xapi_session.ml, line 39
    Apr  6 15:26:42 XCP1 xapi: [error|XCP1|4262 INET :::80|dispatch:session.login_with_password D:c8f25c93214a|backtrace] 3/8 xapi @ XCP1 Called from file ocaml/xapi/xapi_session.ml, line 39
    Apr  6 15:26:42 XCP1 xapi: [error|XCP1|4262 INET :::80|dispatch:session.login_with_password D:c8f25c93214a|backtrace] 4/8 xapi @ XCP1 Called from file ocaml/xapi/server_helpers.ml, line 69
    Apr  6 15:26:42 XCP1 xapi: [error|XCP1|4262 INET :::80|dispatch:session.login_with_password D:c8f25c93214a|backtrace] 5/8 xapi @ XCP1 Called from file ocaml/xapi/server_helpers.ml, line 91
    Apr  6 15:26:42 XCP1 xapi: [error|XCP1|4262 INET :::80|dispatch:session.login_with_password D:c8f25c93214a|backtrace] 6/8 xapi @ XCP1 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 22
    Apr  6 15:26:42 XCP1 xapi: [error|XCP1|4262 INET :::80|dispatch:session.login_with_password D:c8f25c93214a|backtrace] 7/8 xapi @ XCP1 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 26
    Apr  6 15:26:42 XCP1 xapi: [error|XCP1|4262 INET :::80|dispatch:session.login_with_password D:c8f25c93214a|backtrace] 8/8 xapi @ XCP1 Called from file lib/backtrace.ml, line 177
    Apr  6 15:26:42 XCP1 xapi: [error|XCP1|4262 INET :::80|dispatch:session.login_with_password D:c8f25c93214a|backtrace]
    Apr  6 15:26:47 XCP1 xapi: [ info|XCP1|4263 UNIX /var/lib/xcp/xapi||cli] xe host-ha-xapi-healthcheck username=root password=(omitted)
    Apr  6 15:26:47 XCP1 xapi: [debug|XCP1|4263 UNIX /var/lib/xcp/xapi|session.slave_local_login_with_password D:822808a40866|xapi] Add session to local storage
    Apr  6 15:26:49 XCP1 xapi: [debug|XCP1|4261 ||xmlrpc_client] stunnel pid: 18843 (cached = true) returned stunnel to cache
    Apr  6 15:26:49 XCP1 xapi: [debug|XCP1|4264 ||mscgen] xapi=>xapi [label="event.from"];
    Apr  6 15:26:49 XCP1 xapi: [debug|XCP1|4264 ||xmlrpc_client] stunnel pid: 14677 (cached = true) connected to 192.168.222.230:443
    Apr  6 15:26:49 XCP1 xapi: [debug|XCP1|4264 ||xmlrpc_client] with_recorded_stunnelpid task_opt=None s_pid=14677
    Apr  6 15:26:54 XCP1 xapi: [debug|XCP1|6 ha_monitor|HA monitor D:bc6ea1becaa8|xapi_ha] Liveset: online 11d55d91-2e0d-4008-803d-42db55ca4cf7 [ L  A ]; 55185571-6dfa-429a-9415-2dacb9ff1f3a [*L  A ]; a16383be-39c6-4bba-8709-75653eacd759 [ LM A ];
    Apr  6 15:26:54 XCP1 xapi: [debug|XCP1|6 ha_monitor|HA monitor D:bc6ea1becaa8|xapi_ha] Processing warnings
    Apr  6 15:26:54 XCP1 xapi: [debug|XCP1|6 ha_monitor|HA monitor D:bc6ea1becaa8|xapi_ha] Done with warnings
    Apr  6 15:26:54 XCP1 xapi: [debug|XCP1|6 ha_monitor|HA monitor D:bc6ea1becaa8|xapi_ha] The node we think is the master is still alive and marked as master; this is OK
    

    Copy and paste it into another editor; it's more legible on a big screen.

    And now I can see this in Tasks:
    0_1523017912310_Selecció_054.png



  • It seems you have an issue with HA. Can you disable it?



  • Yes, of course.
    On the development pool:
    xe pool-ha-disable
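    A hedged sketch to confirm the pool really ended up with HA off before retrying (assumes the standard XenServer/XCP-ng `xe` CLI on a pool host; it just prints a notice anywhere else):

    ```shell
    # Hedged sketch: verify HA is disabled after `xe pool-ha-disable`.
    if command -v xe >/dev/null 2>&1; then
      pool_uuid=$(xe pool-list --minimal)
      # `ha-enabled` is a read-only pool field; it should now print "false".
      xe pool-param-get uuid="$pool_uuid" param-name=ha-enabled
    else
      echo "xe not found: run this on a pool host"
    fi
    ```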

    Then I tried the Continuous Replication backup-legacy again.

    I received the same "interrupted" error.

    The log:

    Apr  6 15:56:17 XCP1 xapi: [debug|XCP1|4367 INET :::80|handler:http/rrd_updates D:0b2544c2a79f|xmlrpc_client] stunnel pid: 16092 (cached = true) connected to 192.168.222.230:443
    Apr  6 15:56:17 XCP1 xapi: [debug|XCP1|4367 INET :::80|handler:http/rrd_updates D:0b2544c2a79f|xmlrpc_client] with_recorded_stunnelpid task_opt=None s_pid=16092
    Apr  6 15:56:17 XCP1 xapi: [debug|XCP1|4367 INET :::80|handler:http/rrd_updates D:0b2544c2a79f|xmlrpc_client] stunnel pid: 16092 (cached = true) returned stunnel to cache
    Apr  6 15:56:17 XCP1 xapi: [debug|XCP1|4367 INET :::80|Get RRD updates. D:df56aef76915|xapi] hand_over_connection GET /rrd_updates to /var/lib/xcp/xcp-rrdd.forwarded
    Apr  6 15:56:32 XCP1 xapi: [error|XCP1|4368 INET :::80|dispatch:session.login_with_password D:4f8aeaec976e|backtrace] session.login_with_password D:bd4561f87253 failed with exception Server_error(HOST_IS_SLAVE, [ 192.168.222.230 ])
    Apr  6 15:56:32 XCP1 xapi: [error|XCP1|4368 INET :::80|dispatch:session.login_with_password D:4f8aeaec976e|backtrace] Raised Server_error(HOST_IS_SLAVE, [ 192.168.222.230 ])
    Apr  6 15:56:32 XCP1 xapi: [error|XCP1|4368 INET :::80|dispatch:session.login_with_password D:4f8aeaec976e|backtrace] 1/8 xapi @ XCP1 Raised at file ocaml/xapi/xapi_session.ml, line 383
    Apr  6 15:56:32 XCP1 xapi: [error|XCP1|4368 INET :::80|dispatch:session.login_with_password D:4f8aeaec976e|backtrace] 2/8 xapi @ XCP1 Called from file ocaml/xapi/xapi_session.ml, line 39
    Apr  6 15:56:32 XCP1 xapi: [error|XCP1|4368 INET :::80|dispatch:session.login_with_password D:4f8aeaec976e|backtrace] 3/8 xapi @ XCP1 Called from file ocaml/xapi/xapi_session.ml, line 39
    Apr  6 15:56:32 XCP1 xapi: [error|XCP1|4368 INET :::80|dispatch:session.login_with_password D:4f8aeaec976e|backtrace] 4/8 xapi @ XCP1 Called from file ocaml/xapi/server_helpers.ml, line 69
    Apr  6 15:56:32 XCP1 xapi: [error|XCP1|4368 INET :::80|dispatch:session.login_with_password D:4f8aeaec976e|backtrace] 5/8 xapi @ XCP1 Called from file ocaml/xapi/server_helpers.ml, line 91
    Apr  6 15:56:32 XCP1 xapi: [error|XCP1|4368 INET :::80|dispatch:session.login_with_password D:4f8aeaec976e|backtrace] 6/8 xapi @ XCP1 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 22
    Apr  6 15:56:32 XCP1 xapi: [error|XCP1|4368 INET :::80|dispatch:session.login_with_password D:4f8aeaec976e|backtrace] 7/8 xapi @ XCP1 Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 26
    Apr  6 15:56:32 XCP1 xapi: [error|XCP1|4368 INET :::80|dispatch:session.login_with_password D:4f8aeaec976e|backtrace] 8/8 xapi @ XCP1 Called from file lib/backtrace.ml, line 177
    Apr  6 15:56:32 XCP1 xapi: [error|XCP1|4368 INET :::80|dispatch:session.login_with_password D:4f8aeaec976e|backtrace]
    Apr  6 15:56:35 XCP1 xapi: [debug|XCP1|63 heartbeat|Heartbeat D:b70ceab1b744|mscgen] xapi=>xapi [label="host.tickle_heartbeat"];
    Apr  6 15:56:35 XCP1 xapi: [debug|XCP1|63 heartbeat|Heartbeat D:b70ceab1b744|stunnel] stunnel start
    Apr  6 15:56:35 XCP1 xapi: [debug|XCP1|63 heartbeat|Heartbeat D:b70ceab1b744|xmlrpc_client] stunnel pid: 29155 (cached = false) connected to 192.168.222.230:443
    Apr  6 15:56:35 XCP1 xapi: [debug|XCP1|63 heartbeat|Heartbeat D:b70ceab1b744|xmlrpc_client] with_recorded_stunnelpid task_opt=None s_pid=29155
    


  • Please use three backticks around your text, otherwise it's hard to read.



  • Sorry, I didn't know that. Corrected.



  • Can you double-check that you're connecting to this pool with only one server in XOA?

