Pandora FMS community forums

Problems with the interval
Hi,

I have an Ubuntu Server 10.10 machine running the 3.2.1 agent: this one has problems.
I have an Ubuntu Server 10.10 machine running version 3.2.1 of pandora_console and the server.
I have an Ubuntu Server 8.04 machine running the 3.2.1 agent: everything works perfectly.

The problems are on the Ubuntu 10.10 agent. I only monitor the CPU, to avoid "collapses". I set an interval of 30 seconds (the same happens with 60 or 120), so the agent should send data every 30 seconds, but it does not. I can see the process running, yet the server does not receive the data at regular intervals: sometimes it takes many minutes to send, sometimes it works fine, which is rather odd. I looked at the server logs: some entries show the module going to UNKNOWN, and at other times the client does not even connect and the server goes a while without receiving data. Any ideas? The same configuration works perfectly on the other machine, and the two differ only in the Ubuntu version.

Also, when I add more sensors and check the log, some entries captured all of them, others 2, others 6, and so on. So either data is being lost, or values that have not changed are simply not inserted. Is that possible? I would like to have a continuous list of sensor readings.

Thanks.

Honestly, I don't know where to start with this. I'm attaching log data and screenshots.

Server log:

2011-05-12 12:01:24 monitor [V5] Modified timestamp = 2011/05/12 12:01:22 with timezone_offset = 0
2011-05-12 12:01:24 monitor [V5] Updating agent router
2011-05-12 12:01:24 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:01:24 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:01:24 monitor [V10] Processing module 'cpu_user' for agent 'router'.
2011-05-12 12:02:29 monitor [V5] Modified timestamp = 2011/05/12 12:02:24 with timezone_offset = 0
2011-05-12 12:02:29 monitor [V5] Updating agent router
2011-05-12 12:02:29 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:02:29 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:02:29 monitor [V10] Processing module 'cpu_user' for agent 'router'.
2011-05-12 12:03:51 monitor [V10] Module cpu_user is going to UNKNOWN
^C
[email protected]:/home/projecte# tail -f /var/log/pandora/pandora_server.log
2011-05-12 12:04:34 monitor [V5] Modified timestamp = 2011/05/12 12:04:29 with timezone_offset = 0
2011-05-12 12:04:34 monitor [V5] Updating agent router
2011-05-12 12:04:34 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:04:34 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:04:34 monitor [V10] Processing module 'cpu_user' for agent 'router'.
2011-05-12 12:05:34 monitor [V5] Modified timestamp = 2011/05/12 12:05:32 with timezone_offset = 0
2011-05-12 12:05:34 monitor [V5] Updating agent router
2011-05-12 12:05:34 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:05:34 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:05:34 monitor [V10] Processing module 'cpu_user' for agent 'router'.
2011-05-12 12:06:04 monitor [V5] Modified timestamp = 2011/05/12 12:06:03 with timezone_offset = 0
2011-05-12 12:06:04 monitor [V5] Updating agent router
2011-05-12 12:06:04 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:06:04 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:06:04 monitor [V10] Processing module 'cpu_user' for agent 'router'.
2011-05-12 12:06:34 monitor [V5] Modified timestamp = 2011/05/12 12:06:34 with timezone_offset = 0
2011-05-12 12:06:34 monitor [V5] Updating agent router
2011-05-12 12:06:34 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:06:34 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:06:34 monitor [V10] Processing module 'cpu_user' for agent 'router'.
2011-05-12 12:07:09 monitor [V5] Modified timestamp = 2011/05/12 12:07:06 with timezone_offset = 0
2011-05-12 12:07:09 monitor [V5] Updating agent router
2011-05-12 12:07:09 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:07:09 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:07:09 monitor [V10] Processing module 'cpu_user' for agent 'router'.
2011-05-12 12:07:39 monitor [V5] Modified timestamp = 2011/05/12 12:07:37 with timezone_offset = 0
2011-05-12 12:07:39 monitor [V5] Updating agent router
2011-05-12 12:07:39 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:07:39 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:07:39 monitor [V10] Processing module 'cpu_user' for agent 'router'.
2011-05-12 12:08:09 monitor [V5] Modified timestamp = 2011/05/12 12:08:08 with timezone_offset = 0
2011-05-12 12:08:09 monitor [V5] Updating agent router
2011-05-12 12:08:09 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:08:09 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:08:09 monitor [V10] Processing module 'cpu_user' for agent 'router'.
2011-05-12 12:08:39 monitor [V5] Modified timestamp = 2011/05/12 12:08:39 with timezone_offset = 0
2011-05-12 12:08:39 monitor [V5] Updating agent router
2011-05-12 12:08:39 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:08:39 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:08:39 monitor [V10] Processing module 'cpu_user' for agent 'router'.
2011-05-12 12:09:14 monitor [V5] Modified timestamp = 2011/05/12 12:09:10 with timezone_offset = 0
2011-05-12 12:09:14 monitor [V5] Updating agent router
2011-05-12 12:09:14 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:09:14 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:09:14 monitor [V10] Processing module 'cpu_user' for agent 'router'.
2011-05-12 12:10:14 monitor [V5] Modified timestamp = 2011/05/12 12:10:13 with timezone_offset = 0
2011-05-12 12:10:14 monitor [V5] Updating agent router
2011-05-12 12:10:14 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:10:14 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:10:14 monitor [V10] Processing module 'cpu_user' for agent 'router'.
2011-05-12 12:10:44 monitor [V5] Modified timestamp = 2011/05/12 12:10:44 with timezone_offset = 0
2011-05-12 12:10:44 monitor [V5] Updating agent router
2011-05-12 12:10:44 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:10:44 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:10:44 monitor [V10] Processing module 'cpu_user' for agent 'router'.
2011-05-12 12:11:53 monitor [V10] Module cpu_user is going to UNKNOWN
2011-05-12 12:12:19 monitor [V5] Modified timestamp = 2011/05/12 12:12:18 with timezone_offset = 0
2011-05-12 12:12:20 monitor [V5] Updating agent router
2011-05-12 12:12:20 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:12:20 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:12:20 monitor [V10] Processing module 'cpu_user' for agent 'router'.
2011-05-12 12:13:24 monitor [V10] Module cpu_user is going to UNKNOWN
2011-05-12 12:14:55 monitor [V5] Modified timestamp = 2011/05/12 12:14:55 with timezone_offset = 0
2011-05-12 12:14:55 monitor [V5] Updating agent router
2011-05-12 12:14:55 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:14:55 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:14:55 monitor [V10] Processing module 'cpu_user' for agent 'router'.
2011-05-12 12:15:30 monitor [V5] Modified timestamp = 2011/05/12 12:15:26 with timezone_offset = 0
2011-05-12 12:15:30 monitor [V5] Updating agent router
2011-05-12 12:15:30 monitor [V10] Agent id 36 positional data ignored (update_gis_data = 1)
2011-05-12 12:15:30 monitor [V10] Updating keep_alive module for agent 'router'.
2011-05-12 12:15:30 monitor [V10] Processing module 'cpu_user' for agent 'router'.

Client log:

2011/05/12 11:46:16 - [setup] - udp_server is 0
2011/05/12 11:46:16 - [setup] - udp_server_port is 41122
2011/05/12 11:46:16 - [setup] - udp_server_auth_address is 0.0.0.0
2011/05/12 11:46:16 - [setup] - group is Projecte
2011/05/12 11:46:16 - [setup] - server_port is 41121
2011/05/12 11:46:16 - [setup] - transfer_mode is tentacle
2011/05/12 11:46:16 - [setup] - pandora_user is root
2011/05/12 11:46:16 - [setup] - agent_threads is 2
2011/05/12 11:46:16 - [setup] - pandora_nice is -15
2011/05/12 11:46:16 - [setup] - xml_buffer is 1

Agent configuration:


# Pandora agent configuration for projecte.local
server_ip 192.168.0.3
server_path /var/spool/pandora/data_in
temporal /tmp
logfile /var/log/pandora/pandora_agent.log
interval  30
debug 0
udp_server 0
udp_server_port 41122
udp_server_auth_address 0.0.0.0
group Projecte
server_port 41121
transfer_mode tentacle
# User the agent will run as
pandora_user root
agent_threads 2
pandora_nice -15
xml_buffer 1

module_begin
module_name cpu_user
module_type generic_data_string
module_interval 1
module_exec vmstat 1 2 | tail -1 | awk '{ print $13 }'
module_max 100
module_min 0
module_description User CPU Usage (%)
module_end


Looking at the collected data afterwards, you can see that it does not arrive every 30 seconds:

Timestamp                 Value  Age
May 12, 2011, 12:23 pm    0      3 seconds
May 12, 2011, 12:23 pm    1      34 seconds
May 12, 2011, 12:22 pm    0      1:36 minutes
May 12, 2011, 12:21 pm    1      2:39 minutes
May 12, 2011, 12:20 pm    0      3:41 minutes
May 12, 2011, 12:19 pm    1      4:12 minutes
May 12, 2011, 12:18 pm    0      5:47 minutes
May 12, 2011, 12:15 pm    1      8:23 minutes
May 12, 2011, 12:10 pm    0      13:05 minutes
May 12, 2011, 12:10 pm    1      13:36 minutes
May 12, 2011, 12:09 pm    2      14:39 minutes
May 12, 2011, 12:08 pm    0      15:10 minutes
May 12, 2011, 12:08 pm    1      15:41 minutes
May 12, 2011, 12:07 pm    0      16:12 minutes
May 12, 2011, 12:07 pm    1      16:43 minutes
May 12, 2011, 12:05 pm    0      18:17 minutes
May 12, 2011, 12:04 pm    1      19:20 minutes
May 12, 2011, 12:00 pm    0      22:58 minutes
May 12, 2011, 11:58 am    1      25:35 minutes
May 12, 2011, 11:56 am    0      27:09 minutes
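
One way to put numbers on the irregularity is to measure the gaps between consecutive "Modified timestamp" entries in the server log. The following is only a sketch: the function names, the 5-second tolerance, and the choice of Python are my assumptions, not part of Pandora FMS. The sample lines are copied from the server log earlier in this post.

```python
import re
from datetime import datetime

# Matches the agent-side timestamp in lines such as:
# ... monitor [V5] Modified timestamp = 2011/05/12 12:02:24 with timezone_offset = 0
STAMP = re.compile(r"Modified timestamp = (\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})")


def agent_gaps(lines):
    """Return the gaps, in seconds, between consecutive agent timestamps."""
    stamps = []
    for line in lines:
        match = STAMP.search(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S"))
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


def late_gaps(lines, interval=30, tolerance=5):
    """Return only the gaps longer than interval + tolerance seconds.

    `tolerance` is a hypothetical allowance for the time the module
    command itself takes to run; adjust it to taste.
    """
    return [gap for gap in agent_gaps(lines) if gap > interval + tolerance]


sample = [
    "2011-05-12 12:02:29 monitor [V5] Modified timestamp = 2011/05/12 12:02:24 with timezone_offset = 0",
    "2011-05-12 12:02:29 monitor [V5] Updating agent router",
    "2011-05-12 12:04:34 monitor [V5] Modified timestamp = 2011/05/12 12:04:29 with timezone_offset = 0",
    "2011-05-12 12:05:34 monitor [V5] Modified timestamp = 2011/05/12 12:05:32 with timezone_offset = 0",
    "2011-05-12 12:06:04 monitor [V5] Modified timestamp = 2011/05/12 12:06:03 with timezone_offset = 0",
]

print(agent_gaps(sample))  # [125, 63, 31]
print(late_gaps(sample))   # [125, 63]
```

To run it against the real file, pass an open file handle instead of `sample`, for example `agent_gaps(open("/var/log/pandora/pandora_server.log"))`.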

Hi,

I have largely fixed the problem by switching to FTP transfer; it helped a lot, though not completely. Another improvement was manually deleting all the agent data from the database, after which some "Skipping Module..." messages disappeared from the logs.

Looking at the logs, just when everything seemed to be working correctly, one agent suddenly stopped sending data for a while and I saw messages like these:

2011-05-12 18:36:42 monitor [V10] Module tcp_active is going to UNKNOWN
2011-05-12 18:36:42 monitor [V10] Module sql_visitesweb is going to UNKNOWN
2011-05-12 18:36:42 monitor [V10] Module sql_404 is going to UNKNOWN
2011-05-12 18:36:42 monitor [V10] Module proctotal is going to UNKNOWN
2011-05-12 18:36:42 monitor [V10] Module cpu_user is going to UNKNOWN
2011-05-12 18:36:42 monitor [V10] Module sql_update is going to UNKNOWN
2011-05-12 18:36:42 monitor [V10] Module sql_select is going to UNKNOWN
2011-05-12 18:36:42 monitor [V10] Module sql_insert is going to UNKNOWN
2011-05-12 18:36:42 monitor [V10] Module disk_root_free is going to UNKNOWN
2011-05-12 18:36:42 monitor [V10] Module Load Average is going to UNKNOWN
2011-05-12 18:36:42 monitor [V10] Module FitxersOberts is going to UNKNOWN
2011-05-12 18:36:42 monitor [V10] Module sql_grant is going to UNKNOWN
2011-05-12 18:36:42 monitor [V10] Module NumportsTCPactius is going to UNKNOWN
2011-05-12 18:36:42 monitor [V10] Module NumPortsTCPescoltant is going to UNKNOW

I'll keep investigating.

Hi Kitus,

Two quick comments. The UNKNOWN status appears because the server has not received data for more than twice the interval configured in the agent.
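
As a rough illustration of that rule (a sketch only: the function name and signature are made up here, and the server's actual check is not shown in this thread), the condition amounts to:

```python
from datetime import datetime, timedelta


def is_unknown(last_data, now, interval_seconds):
    """True when no data has arrived for more than twice the interval.

    Hypothetical helper that mirrors the rule described above; it is
    not the Pandora FMS server code.
    """
    return (now - last_data) > timedelta(seconds=2 * interval_seconds)


# Times taken from the server log earlier in the thread, interval = 30 s.
last_data = datetime(2011, 5, 12, 12, 2, 29)   # last 'cpu_user' processed
checked_at = datetime(2011, 5, 12, 12, 3, 51)  # "going to UNKNOWN" logged

print(is_unknown(last_data, checked_at, 30))  # True: 82 s is more than 60 s
```

With a 30-second interval, a single delayed or lost delivery is tolerated, but a second consecutive miss pushes the module past the 60-second threshold.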

The other thing is the pandora_nice parameter. You have it set to -15. Have you tried setting it to 0 or a positive value? It may be that the negative value is causing different behavior.

Let's continue here. Regards.