hh.se Publications
1 - 20 of 20
  • 1.
    Aein, Mohamad Javad
    et al.
    Department for Computational Neuroscience at the Bernstein Center Göttingen (Inst. of Physics 3) & Leibniz Science Campus for Primate Cognition, Georg-August-Universität Göttingen, Göttingen, Germany.
    Aksoy, Eren
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research.
    Wörgötter, Florentin
    Department for Computational Neuroscience at the Bernstein Center Göttingen (Inst. of Physics 3) & Leibniz Science Campus for Primate Cognition, Georg-August-Universität Göttingen, Göttingen, Germany.
    Library of actions: Implementing a generic robot execution framework by using manipulation action semantics (2019). In: The International Journal of Robotics Research, ISSN 0278-3649, E-ISSN 1741-3176, Vol. 38, no. 8, p. 910-934. Article in journal (Refereed)
    Abstract [en]

    Drive-thru-Internet is a scenario in cooperative intelligent transportation systems (C-ITSs), where a road-side unit (RSU) provides multimedia services to vehicles that pass by. Performance of the drive-thru-Internet depends on various factors, including data traffic intensity, vehicle traffic density, and radio-link quality within the coverage area of the RSU, and must be evaluated at the stage of system design in order to fulfill the quality-of-service requirements of the customers in C-ITS. In this paper, we present an analytical framework that models downlink traffic in a drive-thru-Internet scenario by means of a multidimensional Markov process: the packet arrivals in the RSU buffer constitute Poisson processes and the transmission times are exponentially distributed. Taking into account the state space explosion problem associated with multidimensional Markov processes, we use iterative perturbation techniques to calculate the stationary distribution of the Markov chain. Our numerical results reveal that the proposed approach yields accurate estimates of various performance metrics, such as the mean queue content and the mean packet delay for a wide range of workloads. © 2019 IEEE.
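As rough intuition for the queueing quantities named in this abstract, the sketch below reduces a single RSU buffer with Poisson packet arrivals and exponentially distributed transmission times to the classic one-dimensional M/M/1 special case; the full model in the paper is a multidimensional Markov chain solved with iterative perturbation techniques, and the function name and rates here are illustrative only.

```python
# Minimal M/M/1 sketch: Poisson packet arrivals (rate lam) and exponentially
# distributed transmission times (rate mu), a 1-D special case of the
# multidimensional Markov model described in the abstract.

def mm1_metrics(lam: float, mu: float) -> dict:
    """Closed-form stationary metrics of an M/M/1 queue."""
    if lam >= mu:
        raise ValueError("Queue is unstable: arrival rate must be below service rate")
    rho = lam / mu                  # server utilization
    mean_queue = rho / (1.0 - rho)  # mean number of packets in the system
    mean_delay = mean_queue / lam   # mean packet delay via Little's law
    return {"utilization": rho, "mean_queue": mean_queue, "mean_delay": mean_delay}

print(mm1_metrics(lam=80.0, mu=100.0))  # 80 packets/s arriving, 100 packets/s served
```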

  • 2.
    Ak, Abdullah Cihan
    et al.
    Istanbul Technical University, Istanbul, Turkey.
    Aksoy, Eren
    Halmstad University, School of Information Technology.
    Sariel, Sanem
    Istanbul Technical University, Istanbul, Turkey.
    Learning Failure Prevention Skills for Safe Robot Manipulation (2023). In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 8, no. 12, p. 7994-8001. Article in journal (Refereed)
    Abstract [en]

Robots are more capable of achieving manipulation tasks for everyday activities than before. However, the safety of manipulation skills that robots employ is still an open problem. Considering all possible failures during skill learning increases the complexity of the process and restrains learning an optimal policy. Nonetheless, safety-focused modularity in the acquisition of skills has not been adequately addressed in previous works. For that purpose, we reformulate skills as base and failure prevention skills, where base skills aim at completing tasks and failure prevention skills aim at reducing the risk of failures occurring. Then, we propose a modular and hierarchical method for safe robot manipulation that augments base skills by learning failure prevention skills with reinforcement learning and forms a skill library to address different safety risks. Furthermore, a skill selection policy that considers estimated risks is used for the robot to select the best control policy for safe manipulation. Our experiments show that the proposed method achieves the given goal while ensuring safety by preventing failures. We also show that, with the proposed method, skill learning is feasible and our safe manipulation tools can be transferred to the real environment. © 2023 IEEE
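As a rough illustration of the skill-library idea, the sketch below pairs a base skill with failure prevention skills and selects by estimated risk; all skill names, failure modes, and threshold values are hypothetical, not taken from the paper.

```python
# Hypothetical sketch of a skill library pairing base skills with failure
# prevention skills, selected by an estimated risk score (all names and
# values are illustrative, not from the paper).

SKILL_LIBRARY = {
    "pick": {"base": "pick_policy",
             "prevention": {"slip": "regrasp_policy",
                            "collision": "slow_approach_policy"}},
}

def select_policy(skill: str, estimated_risks: dict, threshold: float = 0.5) -> str:
    """Return the base policy, or a failure prevention policy if any risk is high."""
    entry = SKILL_LIBRARY[skill]
    # Pick the highest estimated risk and switch policy if it exceeds the threshold.
    failure_mode, risk = max(estimated_risks.items(), key=lambda kv: kv[1])
    if risk > threshold and failure_mode in entry["prevention"]:
        return entry["prevention"][failure_mode]
    return entry["base"]

print(select_policy("pick", {"slip": 0.7, "collision": 0.2}))  # -> regrasp_policy
```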

  • 3.
    Aksoy, Eren
    et al.
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research.
    Baci, Saimir
    Volvo Technology AB, Volvo Group Trucks Technology, Vehicle Automation, Gothenburg, Sweden.
    Cavdar, Selcuk
    Volvo Technology AB, Volvo Group Trucks Technology, Vehicle Automation, Gothenburg, Sweden.
    SalsaNet: Fast Road and Vehicle Segmentation in LiDAR Point Clouds for Autonomous Driving (2020). In: IEEE Intelligent Vehicles Symposium: IV2020, Piscataway, N.J.: IEEE, 2020, p. 926-932. Conference paper (Refereed)
    Abstract [en]

In this paper, we introduce a deep encoder-decoder network, named SalsaNet, for efficient semantic segmentation of 3D LiDAR point clouds. SalsaNet segments the road, i.e. drivable free-space, and vehicles in the scene by employing the Bird-Eye-View (BEV) image projection of the point cloud. To overcome the lack of annotated point cloud data, in particular for the road segments, we introduce an auto-labeling process which transfers automatically generated labels from the camera to LiDAR. We also explore the role of image-like projection of LiDAR data in semantic segmentation by comparing BEV with spherical-front-view projection and show that SalsaNet is projection-agnostic. We perform quantitative and qualitative evaluations on the KITTI dataset, which demonstrate that the proposed SalsaNet outperforms other state-of-the-art semantic segmentation networks in terms of accuracy and computation time. Our code and data are publicly available at https://gitlab.com/aksoyeren/salsanet.git.

    Download full text (pdf)
    SalsaNet
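A minimal sketch of the Bird-Eye-View projection step described in this abstract; the grid ranges and cell size are assumptions, not the paper's exact configuration.

```python
import numpy as np

# Minimal Bird-Eye-View (BEV) projection sketch: discretize x/y into a grid
# and keep the maximum height per cell. Ranges and cell size are assumptions.

def bev_projection(points, x_range=(0.0, 50.0), y_range=(-25.0, 25.0), cell=0.25):
    """points: (N, 3) array of x, y, z LiDAR coordinates -> (H, W) height map."""
    w = int((x_range[1] - x_range[0]) / cell)
    h = int((y_range[1] - y_range[0]) / cell)
    bev = np.full((h, w), -np.inf, dtype=np.float32)
    m = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
         (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[m]
    cols = ((pts[:, 0] - x_range[0]) / cell).astype(int)
    rows = ((pts[:, 1] - y_range[0]) / cell).astype(int)
    np.maximum.at(bev, (rows, cols), pts[:, 2])  # highest z per grid cell
    bev[np.isinf(bev)] = 0.0                     # empty cells -> 0
    return bev

pts = np.random.uniform(low=(-10, -30, -2), high=(60, 30, 1), size=(1000, 3))
print(bev_projection(pts).shape)  # (200, 200)
```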
  • 4.
    Akyol, Gamze
    et al.
    Artificial Intelligence and Robotics Laboratory, Faculty of Computer and Informatics Engineering, Istanbul Technical University, Maslak, Turkey.
    Sariel, Sanem
    Artificial Intelligence and Robotics Laboratory, Faculty of Computer and Informatics Engineering, Istanbul Technical University, Maslak, Turkey.
    Aksoy, Eren Erdal
    Halmstad University, School of Information Technology.
    A Variational Graph Autoencoder for Manipulation Action Recognition and Prediction (2021). Conference paper (Refereed)
    Abstract [en]

Despite decades of research, understanding human manipulation activities is, and has always been, one of the most attractive and challenging research topics in computer vision and robotics. Recognition and prediction of observed human manipulation actions have their roots in applications related to, for instance, human-robot interaction and robot learning from demonstration. The current research trend heavily relies on advanced convolutional neural networks to process structured Euclidean data, such as RGB camera images. These networks, however, come with immense computational complexity to be able to process high-dimensional raw data.

    Different from the related works, we here introduce a deep graph autoencoder to jointly learn recognition and prediction of manipulation tasks from symbolic scene graphs, instead of relying on the structured Euclidean data. Our network has a variational autoencoder structure with two branches: one for identifying the input graph type and one for predicting the future graphs. The input of the proposed network is a set of semantic graphs which store the spatial relations between subjects and objects in the scene. The network output is a label set representing the detected and predicted class types. We benchmark our new model against different state-of-the-art methods on two different datasets, MANIAC and MSRC-9, and show that our proposed model can achieve better performance. We also release our source code https://github.com/gamzeakyol/GNet.
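A compact sketch of the two-branch idea in PyTorch: a shared encoder yields a latent code, one head classifies the current action, the other predicts future graph features. Dimensions are hypothetical and the input is a flattened feature vector; the actual GNet model operates on graph-structured inputs and is available at the linked repository.

```python
import torch
import torch.nn as nn

# Two-branch variational autoencoder sketch: classify the input graph type and
# predict future graph features from the same latent code. Sizes hypothetical.

class TwoBranchVAE(nn.Module):
    def __init__(self, in_dim=64, latent=16, n_classes=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU())
        self.mu = nn.Linear(32, latent)       # mean of the latent Gaussian
        self.logvar = nn.Linear(32, latent)   # log-variance of the latent Gaussian
        self.classify = nn.Linear(latent, n_classes)  # branch 1: action label
        self.predict = nn.Linear(latent, in_dim)      # branch 2: future features

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.classify(z), self.predict(z), mu, logvar

model = TwoBranchVAE()
logits, future, mu, logvar = model(torch.randn(4, 64))
print(logits.shape, future.shape)  # torch.Size([4, 8]) torch.Size([4, 64])
```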

  • 5.
    Cooney, Martin
    et al.
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), Intelligent Systems´ laboratory. Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research.
    Orand, Abbas
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research.
    Larsson, Hanna
    Halmstad University.
    Pihl, Jacob
    Halmstad University.
    Aksoy, Eren
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research.
    Exercising with an “Iron Man”: Design for a Robot Exercise Coach for Persons with Dementia (2020). In: 29th IEEE International Conference on Robot and Human Interactive Communication, Piscataway: Institute of Electrical and Electronics Engineers (IEEE), 2020, p. 899-905. Conference paper (Refereed)
    Abstract [en]

Socially assistive robots are increasingly being designed to interact with humans in various therapeutical scenarios. We believe that one useful scenario is providing exercise coaching for Persons with Dementia (PWD), which involves unique challenges related to memory and communication. We present a design for a robot that can seek to help a PWD to conduct exercises by recognizing their behaviors and providing appropriate feedback, in an online, multimodal, and engaging way. Additionally, following a mid-fidelity prototyping approach, we report on some observations from an exploratory user study using a Baxter robot; although limited by the sample size and our simplified approach, the results suggested the usefulness of the general scenario, and that the degree to which a robot provides feedback (occasional or continuous) could moderate impressions of attentiveness or fun. Some possibilities for future improvement are outlined, touching on richer recognition and behavior generation strategies based on deep learning and haptic feedback, toward informing next designs. © 2020 IEEE.

  • 6.
    Cortinhal, Tiago
    et al.
    Halmstad University, School of Information Technology.
    Aksoy, Eren
    Halmstad University, School of Information Technology.
    Depth- and semantics-aware multi-modal domain translation: Generating 3D panoramic color images from LiDAR point clouds (2024). In: Robotics and Autonomous Systems, ISSN 0921-8890, E-ISSN 1872-793X, Vol. 171, p. 1-9, article id 104583. Article in journal (Refereed)
    Abstract [en]

This work presents a new depth- and semantics-aware conditional generative model, named TITAN-Next, for cross-domain image-to-image translation in a multi-modal setup between LiDAR and camera sensors. The proposed model leverages scene semantics as a mid-level representation and is able to translate raw LiDAR point clouds to RGB-D camera images by solely relying on semantic scene segments. We claim that this is the first framework of its kind and it has practical applications in autonomous vehicles such as providing a fail-safe mechanism and augmenting available data in the target image domain. The proposed model is evaluated on the large-scale and challenging Semantic-KITTI dataset, and experimental findings show that it considerably outperforms the original TITAN-Net and other strong baselines by a 23.7% margin in terms of IoU. © 2023 The Author(s).

  • 7.
    Cortinhal, Tiago
    et al.
    Halmstad University, School of Information Technology.
    Gouigah, Idriss
    Halmstad University, School of Information Technology.
    Aksoy, Eren
    Halmstad University, School of Information Technology.
    Semantics-aware LiDAR-Only Pseudo Point Cloud Generation for 3D Object Detection (2024). Conference paper (Refereed)
    Abstract [en]

Although LiDAR sensors are crucial for autonomous systems due to providing precise depth information, they struggle with capturing fine object details, especially at a distance, due to sparse and non-uniform data. Recent advances introduced pseudo-LiDAR, i.e., synthetic dense point clouds, using additional modalities such as cameras to enhance 3D object detection. We present a novel LiDAR-only framework that augments raw scans with denser pseudo point clouds by solely relying on LiDAR sensors and scene semantics, omitting the need for cameras. Our framework first utilizes a segmentation model to extract scene semantics from raw point clouds, and then employs a multi-modal domain translator to generate synthetic image segments and depth cues without real cameras. This yields a dense pseudo point cloud enriched with semantic information. We also introduce a new semantically guided projection method, which enhances detection performance by retaining only relevant pseudo points. We applied our framework to different advanced 3D object detection methods and reported a performance improvement of up to 2.9%. We also obtained results comparable to other state-of-the-art LiDAR-only detectors on the KITTI 3D object detection dataset.
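A toy sketch of the semantically guided projection idea from this abstract: keep only pseudo points whose predicted semantic class is task-relevant. The class IDs and the notion of "relevant" classes here are hypothetical.

```python
import numpy as np

# Toy sketch of semantically guided filtering: keep only pseudo points whose
# predicted semantic class is relevant for detection (class IDs hypothetical).

RELEVANT = {1, 2}  # e.g. 1 = vehicle, 2 = pedestrian

def filter_pseudo_points(points: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """points: (N, 3) pseudo point cloud, labels: (N,) semantic class IDs."""
    keep = np.isin(labels, list(RELEVANT))
    return points[keep]

pts = np.random.randn(5, 3)
labs = np.array([0, 1, 2, 0, 1])
print(filter_pseudo_points(pts, labs).shape)  # (3, 3)
```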

  • 8.
    Cortinhal, Tiago
    et al.
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research.
    Kurnaz, Fatih
    Middle East Technical University, Ankara, Turkey.
    Aksoy, Eren
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research.
    Semantics-aware Multi-modal Domain Translation: From LiDAR Point Clouds to Panoramic Color Images (2021). In: 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Los Alamitos: IEEE Computer Society, 2021, p. 3032-3041. Conference paper (Refereed)
    Abstract [en]

In this work, we present a simple yet effective framework to address the domain translation problem between different sensor modalities with unique data formats. By relying only on the semantics of the scene, our modular generative framework can, for the first time, synthesize a panoramic color image from a given full 3D LiDAR point cloud. The framework starts with semantic segmentation of the point cloud, which is initially projected onto a spherical surface. The same semantic segmentation is applied to the corresponding camera image. Next, our new conditional generative model adversarially learns to translate the predicted LiDAR segment maps to the camera image counterparts. Finally, generated image segments are processed to render the panoramic scene images. We provide a thorough quantitative evaluation on the SemanticKITTI dataset and show that our proposed framework outperforms other strong baseline models. Our source code is available at https://github.com/halmstad-University/TITAN-NET. © 2021 IEEE.

    Download full text (pdf)
    fulltext
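A minimal sketch of the first pipeline stage described in this abstract, projecting a LiDAR point cloud onto a spherical (range) image; the image size and vertical field of view are assumptions, not the paper's exact settings.

```python
import numpy as np

# Spherical front-view projection sketch: map each 3-D LiDAR point to pixel
# coordinates via its azimuth/elevation angles. FOV and image size assumed.

def spherical_projection(points, h=64, w=1024, fov_up=3.0, fov_down=-25.0):
    """points: (N, 3) -> (h, w) range image (meters, 0 where empty)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1) + 1e-8
    yaw = np.arctan2(y, x)    # azimuth in [-pi, pi]
    pitch = np.arcsin(z / r)  # elevation
    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    u = ((1.0 - (pitch - fov_down_r) / (fov_up_r - fov_down_r)) * (h - 1)).astype(int)
    v = ((0.5 * (yaw / np.pi + 1.0)) * (w - 1)).astype(int)
    img = np.zeros((h, w), dtype=np.float32)
    valid = (u >= 0) & (u < h)  # drop points outside the vertical FOV
    img[u[valid], v[valid]] = r[valid]
    return img

print(spherical_projection(np.random.uniform(-20, 20, size=(1000, 3))).shape)  # (64, 1024)
```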
  • 9.
    Cortinhal, Tiago
    et al.
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research.
    Tzelepi, George
    Volvo Technology AB, Volvo Group Trucks Technology, Gothenburg, Sweden.
    Erdal Aksoy, Eren
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), Centre for Research on Embedded Systems (CERES). Volvo Technology AB, Volvo Group Trucks Technology, Gothenburg, Sweden.
    SalsaNext: Fast, Uncertainty-aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving (2021). In: Advances in Visual Computing: 15th International Symposium, ISVC 2020, San Diego, CA, USA, October 5–7, 2020, Proceedings, Part II / [ed] Bebis, G., Yin, Z., Kim, E., Bender, J., Subr, K., Kwon, B.C., Zhao, J., Kalkofen, D., Baciu, G., Cham: Springer, 2021, Vol. 12510, p. 207-222. Conference paper (Refereed)
    Abstract [en]

In this paper, we introduce SalsaNext for the uncertainty-aware semantic segmentation of a full 3D LiDAR point cloud in real-time. SalsaNext is the next version of SalsaNet, which has an encoder-decoder architecture where the encoder unit has a set of ResNet blocks and the decoder part combines upsampled features from the residual blocks. In contrast to SalsaNet, we introduce a new context module, replace the ResNet encoder blocks with a new residual dilated convolution stack with gradually increasing receptive fields, and add the pixel-shuffle layer in the decoder. Additionally, we switch from stride convolution to average pooling and also apply central dropout treatment. To directly optimize the Jaccard index, we further combine the weighted cross entropy loss with Lovász-Softmax loss. We finally inject a Bayesian treatment to compute the epistemic and aleatoric uncertainties for each point in the cloud. We provide a thorough quantitative evaluation on the Semantic-KITTI dataset, which demonstrates that the proposed SalsaNext outperforms other published semantic segmentation networks and achieves 3.6% more accuracy over the previous state-of-the-art method. We also release our source code [1]. © 2020, Springer Nature Switzerland AG.

    [1] https://github.com/TiagoCortinhal/SalsaNext

    Download full text (pdf)
    fulltext
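A minimal sketch of the combined loss idea from this abstract: class-weighted cross-entropy plus a differentiable Jaccard surrogate. Here a simple soft-IoU term stands in for the Lovász-Softmax extension actually used in the paper.

```python
import torch
import torch.nn.functional as F

# Sketch of a SalsaNext-style loss: class-weighted cross-entropy combined with
# a differentiable Jaccard surrogate. A simple soft-IoU term stands in for the
# Lovász-Softmax loss used in the paper.

def combined_loss(logits, target, class_weights):
    """logits: (B, C, H, W), target: (B, H, W) int labels, class_weights: (C,)."""
    wce = F.cross_entropy(logits, target, weight=class_weights)
    probs = F.softmax(logits, dim=1)
    onehot = F.one_hot(target, logits.shape[1]).permute(0, 3, 1, 2).float()
    inter = (probs * onehot).sum(dim=(0, 2, 3))
    union = (probs + onehot - probs * onehot).sum(dim=(0, 2, 3))
    soft_iou = (inter / (union + 1e-8)).mean()
    return wce + (1.0 - soft_iou)

logits = torch.randn(2, 4, 8, 8)
target = torch.randint(0, 4, (2, 8, 8))
print(combined_loss(logits, target, torch.ones(4)).item())
```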
  • 10.
    Englund, Cristofer
    et al.
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research. RISE Research Institutes of Sweden, Göteborg, Sweden.
    Erdal Aksoy, Eren
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research.
    Alonso-Fernandez, Fernando
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research.
    Cooney, Martin Daniel
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research.
    Pashami, Sepideh
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research. RISE Research Institutes of Sweden, Göteborg, Sweden.
    Åstrand, Björn
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research.
    AI Perspectives in Smart Cities and Communities to Enable Road Vehicle Automation and Smart Traffic Control (2021). In: Smart Cities, E-ISSN 2624-6511, Vol. 4, no. 2, p. 783-802. Article in journal (Refereed)
    Abstract [en]

Smart Cities and Communities (SCC) constitute a new paradigm in urban development. SCC ideates on a data-centered society aiming at improving efficiency by automating and optimizing activities and utilities. Information and communication technology along with the Internet of Things enables data collection, and with the help of artificial intelligence (AI), situation awareness can be obtained to feed the SCC actors with enriched knowledge. This paper describes AI perspectives in SCC and gives an overview of AI-based technologies used in traffic to enable road vehicle automation and smart traffic control. Perception, Smart Traffic Control and Driver Modelling are described along with open research challenges and standardization to help introduce advanced driver assistance systems and automated vehicle functionality in traffic. To fully realize the potential of SCC and to create a holistic view on a city level, the availability of data from different stakeholders is needed. Further, though AI technologies provide accurate predictions and classifications, there is an ambiguity regarding the correctness of their outputs. This can make it difficult for the human operator to trust the system. Today there are no methods that can be used to match function requirements with the level of detail in data annotation in order to train an accurate model. Another challenge related to trust is explainability: while the models have difficulty explaining how they come to certain conclusions, it is difficult for humans to trust them. © 2021 by the authors. Licensee MDPI, Basel, Switzerland.

  • 11.
    Inceoglu, Arda
    et al.
    Artificial Intelligence and Robotics Laboratory, Faculty of Computer and Informatics Engineering, Istanbul Technical University, Maslak, Turkey.
    Aksoy, Eren Erdal
    Halmstad University, School of Information Technology.
    Ak, Abdullah Cihan
    Artificial Intelligence and Robotics Laboratory, Faculty of Computer and Informatics Engineering, Istanbul Technical University, Maslak, Turkey.
    Sariel, Sanem
    Artificial Intelligence and Robotics Laboratory, Faculty of Computer and Informatics Engineering, Istanbul Technical University, Maslak, Turkey.
    FINO-Net: A Deep Multimodal Sensor Fusion Framework for Manipulation Failure Detection (2021). In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 2021, p. 6841-6847. Conference paper (Refereed)
    Abstract [en]

To ensure safety, we need robots that are more aware of the unintended outcomes of their actions. This can be achieved by an onboard failure detection system that monitors and detects such cases. Onboard failure detection is challenging with a limited onboard sensor setup due to the limited sensing capabilities of each sensor. To alleviate these challenges, we propose FINO-Net, a novel multimodal sensor fusion-based deep neural network to detect and identify manipulation failures. We also introduce FAILURE, a multimodal dataset containing 229 real-world manipulation recordings collected with a Baxter robot. Our network combines RGB, depth and audio readings to effectively detect failures. Results indicate that fusing RGB with depth and audio modalities significantly improves the performance. FINO-Net achieves 98.60% detection accuracy on our novel dataset. Code and data are publicly available at https://github.com/ardai/fino-net.
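A compact late-fusion sketch in PyTorch of the idea of combining RGB, depth, and audio cues for a success/failure decision; the embedding sizes and head are hypothetical, and FINO-Net's actual encoders and fusion design are in the linked repository.

```python
import torch
import torch.nn as nn

# Late-fusion sketch: separate encoders embed RGB, depth, and audio; the
# concatenated embeddings feed a success/failure classifier head.
# Embedding sizes are hypothetical.

class FusionClassifier(nn.Module):
    def __init__(self, rgb_dim=128, depth_dim=64, audio_dim=32):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(rgb_dim + depth_dim + audio_dim, 64), nn.ReLU(),
            nn.Linear(64, 2),  # 2 classes: success / failure
        )

    def forward(self, rgb_emb, depth_emb, audio_emb):
        return self.head(torch.cat([rgb_emb, depth_emb, audio_emb], dim=1))

model = FusionClassifier()
out = model(torch.randn(4, 128), torch.randn(4, 64), torch.randn(4, 32))
print(out.shape)  # torch.Size([4, 2])
```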

  • 12.
    Inceoglu, Arda
    et al.
    Istanbul Technical University, Maslak, Turkey.
    Aksoy, Eren
    Halmstad University, School of Information Technology, Center for Applied Intelligent Systems Research (CAISR).
    Sariel, Sanem
    Istanbul Technical University, Maslak, Turkey.
    Multimodal Detection and Classification of Robot Manipulation Failures (2024). In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 9, no. 2, p. 1396-1403. Article in journal (Refereed)
    Abstract [en]

    An autonomous service robot should be able to interact with its environment safely and robustly without requiring human assistance. Unstructured environments are challenging for robots since the exact prediction of outcomes is not always possible. Even when the robot behaviors are well-designed, the unpredictable nature of the physical robot-object interaction may lead to failures in object manipulation. In this letter, we focus on detecting and classifying both manipulation and post-manipulation phase failures using the same exteroception setup. We cover a diverse set of failure types for primary tabletop manipulation actions. In order to detect these failures, we propose FINO-Net (Inceoglu et al., 2021), a deep multimodal sensor fusion-based classifier network architecture. FINO-Net accurately detects and classifies failures from raw sensory data without any additional information on task description and scene state. In this work, we use our extended FAILURE dataset (Inceoglu et al., 2021) with 99 new multimodal manipulation recordings and annotate them with their corresponding failure types. FINO-Net achieves 0.87 failure detection and 0.80 failure classification F1 scores. Experimental results show that FINO-Net is also appropriate for real-time use. © 2016 IEEE.

  • 13.
    Nowaczyk, Sławomir
    et al.
    Halmstad University, School of Information Technology.
    Resmini, Andrea
    Halmstad University, School of Information Technology.
    Long, Vicky
    Halmstad University, School of Business, Innovation and Sustainability.
    Fors, Vaike
    Halmstad University, School of Information Technology.
    Cooney, Martin
    Halmstad University, School of Information Technology.
    Duarte, Eduardo K.
    Pink, Sarah
    Monash University, Melbourne, Australia.
    Aksoy, Eren Erdal
    Halmstad University, School of Information Technology.
    Vinel, Alexey
    Halmstad University, School of Information Technology.
    Dougherty, Mark
    Halmstad University, School of Information Technology.
    Smaller is smarter: A case for small to medium-sized smart cities (2022). In: Journal of Smart Cities and Society, ISSN 2772-3577, Vol. 1, no. 2, p. 95-117. Article in journal (Refereed)
    Abstract [en]

Smart Cities have been around as a concept for quite some time. However, most examples of Smart Cities (SCs) originate from megacities (MCs), despite the fact that most people live in Small and Medium-sized Cities (SMCs). This paper addresses the contextual setting for smart cities from the perspective of such small and medium-sized cities. It starts with an overview of the current trends in the research and development of SCs, highlighting the current bias and the challenges it brings. We follow with a few concrete examples of projects which introduced some form of “smartness” in the small and medium cities context, explaining what influence that context had and what specific effects it led to. Building on those experiences, we summarise the current understanding of Smart Cities, with a focus on its multi-faceted (e.g., smart economy, smart people, smart governance, smart mobility, smart environment and smart living) nature; we describe mainstream publications and highlight the bias towards large and very large cities (sometimes even subconscious); give examples of (often implicit) assumptions deriving from this bias; and finally, we define the need to contextualise SCs also for small and medium-sized cities. The aim of this paper is to establish and strengthen the discourse on the need for an SMC perspective in the Smart Cities literature. We hope to provide an initial formulation of the problem, mainly focusing on the unique needs and the specific requirements. We expect that the three example cases describing the effects of applying new solutions and studying SCs in small and medium-sized cities, together with the lessons learnt from these experiences, will encourage more research to consider the SMC perspective. To this end, the current paper aims to justify the need for this under-studied perspective, as well as to propose interesting challenges faced by SMCs that can serve as initial directions of such research.

    Download full text (pdf)
    fulltext
  • 14.
    Orand, Abbas
    et al.
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS).
    Erdal Aksoy, Eren
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research.
    Miyasaka, Hiroyuki
    Department of Rehabilitation, Fujita Health University, Nanakuri Memorial Hospital, Tsu, Japan.
    Weeks Levy, Carolyn
    Schools of Mechatronics Systems Engineering and Engineering Science, Simon Fraser University, Surrey, Canada.
    Zhang, Xin
    Schools of Mechatronics Systems Engineering and Engineering Science, Simon Fraser University, Surrey, Canada.
    Menon, Carlo
    Schools of Mechatronics Systems Engineering and Engineering Science, Simon Fraser University, Surrey, Canada.
    Bilateral Tactile Feedback-Enabled Training for Stroke Survivors Using Microsoft Kinect™ (2019). In: Sensors, E-ISSN 1424-8220, Vol. 19, no. 16, article id 3474. Article in journal (Refereed)
    Abstract [en]

Rehabilitation and mobility training of post-stroke patients is crucial for their functional recovery. While traditional methods can still help patients, new rehabilitation and mobility training methods are necessary to facilitate better recovery at lower costs. In this work, our objective was to design and develop a rehabilitation training system targeting the functional recovery of post-stroke users with high efficiency. To accomplish this goal, we applied a bilateral training method, which proved to be effective in enhancing motor recovery, using tactile feedback for the training. One participant with hemiparesis underwent six weeks of training. Two protocols, “contralateral arm matching” and “both arms moving together”, were carried out by the participant. Each of the protocols consisted of “shoulder abduction” and “shoulder flexion” at angles close to 30 and 60 degrees. The participant carried out 15 repetitions at each angle for each task. For example, in the “contralateral arm matching” protocol, the unaffected arm of the participant was set to an angle close to 30 degrees. He was then requested to keep the unaffected arm at the specified angle while trying to match the position with the affected arm. Whenever the two arms matched, a vibration was given on both brachialis muscles. For the “both arms moving together” protocol, the two arms were first set approximately to an angle of either 30 or 60 degrees. The participant was asked to return both arms to a relaxed position before moving both arms back to the remembered specified angle. The arm that was slower in moving to the specified angle received a vibration. We performed clinical assessments before, midway through, and after the training period using a Fugl-Meyer assessment (FMA), a Wolf motor function test (WMFT), and a proprioceptive assessment. For the assessments, two ipsilateral and contralateral arm matching tasks, each consisting of three movements (shoulder abduction, shoulder flexion, and elbow flexion), were used. Movements were performed at two angles, 30 and 60 degrees. For both tasks, the same procedure was used. For example, in the case of the ipsilateral arm matching task, an experimenter positioned the affected arm of the participant at 30 degrees of shoulder abduction. The participant was requested to keep the arm in that position for ~5 s before returning to a relaxed initial position. Then, after another ~5-s delay, the participant moved the affected arm back to the remembered position. An experimenter measured this shoulder abduction angle manually using a goniometer. The same procedure was repeated for the 60-degree angle and for the other two movements. We applied a low-cost Kinect to extract the participant’s body joint position data. Tactile feedback was given based on the arm position detected by the Kinect sensor. By using a Kinect sensor, we demonstrated the feasibility of the system for the training of a post-stroke user. The proposed system can further be employed for self-training of patients at home. The results of the FMA, WMFT, and goniometer angle measurements showed improvements in several tasks, suggesting a positive effect of the training system and its feasibility for further application for stroke survivors’ rehabilitation. © 2019 by the authors.

    Download full text (pdf)
    fulltext
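A small sketch of the joint-angle step behind this training system: computing a shoulder angle from three Kinect joint positions and testing whether both arms match closely enough to trigger the vibration feedback. The joint choices and the 5-degree tolerance are assumptions, not the paper's parameters.

```python
import numpy as np

# Sketch: shoulder angle from three Kinect joints, and a match test that would
# trigger the tactile (vibration) feedback. Tolerance value is an assumption.

def joint_angle(shoulder, elbow, hip):
    """Angle (degrees) at the shoulder between the upper arm and the torso."""
    upper_arm = elbow - shoulder
    torso = hip - shoulder
    cosang = np.dot(upper_arm, torso) / (np.linalg.norm(upper_arm) * np.linalg.norm(torso))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

def arms_match(angle_left, angle_right, tol_deg=5.0) -> bool:
    """True when both arms are within tolerance -> vibrate both brachialis muscles."""
    return abs(angle_left - angle_right) <= tol_deg

a = joint_angle(np.array([0, 0, 0]), np.array([1, -1, 0]), np.array([0, -1, 0]))
print(round(a, 1), arms_match(a, 42.0))  # 45.0 True
```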
  • 15.
    Rezk, Nesma
    et al.
    Halmstad University, School of Information Technology.
    Nordström, Tomas
    Umeå University, Umeå, Sweden.
    Stathis, Dimitrios
    KTH University, Stockholm, Sweden.
    Ul-Abdin, Zain
    Halmstad University, School of Information Technology.
    Aksoy, Eren
    Halmstad University, School of Information Technology.
    Hemani, Ahmed
    KTH University, Stockholm, Sweden.
    MOHAQ: Multi-Objective Hardware-Aware Quantization of recurrent neural networks (2022). In: Journal of Systems Architecture, ISSN 1383-7621, E-ISSN 1873-6165, Vol. 133, article id 102778. Article in journal (Refereed)
    Abstract [en]

The compression of deep learning models is of fundamental importance in deploying such models to edge devices. The selection of compression parameters can be automated to meet changes in the hardware platform and application. This article introduces a Multi-Objective Hardware-Aware Quantization (MOHAQ) method, which considers hardware performance and inference error as objectives for mixed-precision quantization. The proposed method feasibly evaluates candidate solutions in a large search space by relying on two steps. First, post-training quantization is applied for fast solution evaluation (inference-only search). Second, we propose the “beacon-based search” to retrain selected solutions only and use them as beacons to estimate the effect of retraining on other solutions. We use speech recognition models on the TIMIT dataset. Experimental evaluations show that Simple Recurrent Unit (SRU)-based models can be compressed by up to 8x by post-training quantization without any significant error increase. On SiLago, we found solutions that achieve 97% and 86% of the maximum possible speedup and energy saving, with a minor increase in error on an SRU-based model. On Bitfusion, the beacon-based search reduced the error gain of the inference-only search on SRU-based models and a Light Gated Recurrent Unit (LiGRU)-based model by up to 4.9 and 3.9 percentage points, respectively.
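A toy sketch of the inference-only search ingredient: uniformly quantize a weight tensor at a candidate bit-width and measure the resulting error, the kind of cheap evaluation such a search can rank solutions with. This is a generic illustration, not the MOHAQ search itself.

```python
import numpy as np

# Toy post-training quantization sketch: quantize weights at a candidate
# bit-width and measure reconstruction error (generic, not MOHAQ itself).

def quantize(weights: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric uniform quantization to 2**bits levels."""
    scale = np.abs(weights).max() / (2 ** (bits - 1) - 1)
    return np.round(weights / scale) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=1000)
for bits in (8, 4, 2):
    err = np.mean((w - quantize(w, bits)) ** 2)
    print(f"{bits}-bit quantization MSE: {err:.6f}")
```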

  • 16.
    Rosberg, Felix
    et al.
    Berge Consulting, Gothenburg, Sweden.
    Aksoy, Eren
    Halmstad University, School of Information Technology, Center for Applied Intelligent Systems Research (CAISR).
    Alonso-Fernandez, Fernando
    Halmstad University, School of Information Technology, Center for Applied Intelligent Systems Research (CAISR).
    Englund, Cristofer
    Halmstad University, School of Information Technology, Center for Applied Intelligent Systems Research (CAISR).
    FaceDancer: Pose- and Occlusion-Aware High Fidelity Face Swapping (2023). In: Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023, Piscataway: IEEE, 2023, p. 3443-3452. Conference paper (Refereed)
    Abstract [en]

In this work, we present a new single-stage method for subject agnostic face swapping and identity transfer, named FaceDancer. We have two major contributions: Adaptive Feature Fusion Attention (AFFA) and Interpreted Feature Similarity Regularization (IFSR). The AFFA module is embedded in the decoder and adaptively learns to fuse attribute features and features conditioned on identity information without requiring any additional facial segmentation process. In IFSR, we leverage the intermediate features in an identity encoder to preserve important attributes such as head pose, facial expression, lighting, and occlusion in the target face, while still transferring the identity of the source face with high fidelity. We conduct extensive quantitative and qualitative experiments on various datasets and show that the proposed FaceDancer outperforms other state-of-the-art networks in terms of identity transfer, while having significantly better pose preservation than most of the previous methods. © 2023 IEEE.

    Download full text (pdf)
    fulltext
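A rough sketch of the gated feature-fusion idea behind an AFFA-style module: a learned gate blends attribute features with identity-conditioned features per spatial location. The layer sizes are hypothetical and this is not the paper's exact module.

```python
import torch
import torch.nn as nn

# Gated feature fusion sketch in the spirit of AFFA: a learned gate decides,
# per spatial location, how much to keep from the target's attribute features
# versus the identity-conditioned features. Channel count is hypothetical.

class GatedFusion(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(2 * channels, channels, 3, padding=1),
                                  nn.Sigmoid())

    def forward(self, attr_feat, id_feat):
        g = self.gate(torch.cat([attr_feat, id_feat], dim=1))
        return g * attr_feat + (1.0 - g) * id_feat  # convex blend per location

fuse = GatedFusion()
out = fuse(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```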
  • 17.
    Rosberg, Felix
    et al.
    Halmstad University, School of Information Technology. Berge Consulting, Gothenburg, Sweden.
    Aksoy, Eren
    Halmstad University, School of Information Technology.
    Englund, Cristofer
    Halmstad University, School of Information Technology.
    Alonso-Fernandez, Fernando
    Halmstad University, School of Information Technology.
    FIVA: Facial Image and Video Anonymization and Anonymization Defense (2023). In: 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Los Alamitos, CA: IEEE, 2023, p. 362-371. Conference paper (Refereed)
    Abstract [en]

    In this paper, we present a new approach for facial anonymization in images and videos, abbreviated as FIVA. Our proposed method is able to maintain the same face anonymization consistently over frames with our suggested identity-tracking and guarantees a strong difference from the original face. FIVA allows for 0 true positives for a false acceptance rate of 0.001. Our work considers the important security issue of reconstruction attacks and investigates adversarial noise, uniform noise, and parameter noise to disrupt reconstruction attacks. In this regard, we apply different defense and protection methods against these privacy threats to demonstrate the scalability of FIVA. On top of this, we also show that reconstruction attack models can be used for detection of deep fakes. Last but not least, we provide experimental results showing how FIVA can even enable face swapping, which is purely trained on a single target image. © 2023 IEEE.

  • 18.
    Rothfuss, Jonas
    et al.
    Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Karlsruhe, Germany.
    Ferreira, Fabio
    Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Karlsruhe, Germany.
    Aksoy, Eren
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research. Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Karlsruhe, Germany.
    Zhou, You
    Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Karlsruhe, Germany.
    Asfour, Tamim
    Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Karlsruhe, Germany.
    Deep Episodic Memory: Encoding, Recalling, and Predicting Episodic Experiences for Robot Action Execution (2018). In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 3, no. 4, p. 4007-4014. Article in journal (Refereed)
    Abstract [en]

We present a novel deep neural network architecture for representing robot experiences in an episodic-like memory that facilitates encoding, recalling, and predicting action experiences. Our proposed unsupervised deep episodic memory model works as follows: first, it encodes observed actions in a latent vector space; second, based on this latent encoding, it infers the most similar episodes previously experienced; third, it reconstructs the original episodes; and finally, it predicts future frames in an end-to-end fashion. Results show that conceptually similar actions are mapped into the same region of the latent vector space. Based on these results, we introduce an action matching and retrieval mechanism, benchmark its performance on two large-scale action datasets, 20BN-something-something and ActivityNet, and evaluate its generalization capability in a real-world scenario on a humanoid robot.
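A minimal sketch of the action matching and retrieval mechanism: nearest-neighbor lookup in the learned latent space. The encoder producing the latent codes is abstracted away here, and the dimensions are illustrative.

```python
import numpy as np

# Sketch of episodic retrieval: episodes are stored as latent vectors; a query
# episode's encoding retrieves the most similar past experience by cosine
# similarity. The encoder producing the latents is abstracted away.

def cosine_retrieve(memory: np.ndarray, query: np.ndarray) -> int:
    """memory: (N, D) stored latent codes, query: (D,) -> index of best match."""
    mem = memory / np.linalg.norm(memory, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    return int(np.argmax(mem @ q))

memory = np.random.randn(100, 32)                # 100 remembered episodes
query = memory[17] + 0.05 * np.random.randn(32)  # a slightly perturbed episode
print(cosine_retrieve(memory, query))            # most likely 17
```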

  • 19.
    Tzelepis, Georgies
    et al.
    Institut de Robòtica i Informàtica Industrial (CSIC-UPC), Barcelona, Spain.
    Aksoy, Eren
    Halmstad University, School of Information Technology, Center for Applied Intelligent Systems Research (CAISR).
    Borras, Julia
    Institut de Robòtica i Informàtica Industrial (CSIC-UPC), Barcelona, Spain.
    Alenyà, Guillem
    Institut de Robòtica i Informàtica Industrial (CSIC-UPC), Barcelona, Spain.
    Semantic State Estimation in Robot Cloth Manipulations Using Domain Adaptation from Human Demonstrations (2024). In: Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP / [ed] Petia Radeva; Antonino Furnari; Kadi Bouatouch; A. Augusto Sousa, Setúbal: SciTePress, 2024, Vol. 4, p. 172-182. Conference paper (Refereed)
    Abstract [en]

    Deformable object manipulations, such as those involving textiles, present a significant challenge due to their high dimensionality and complexity. In this paper, we propose a solution for estimating semantic states in cloth manipulation tasks. To this end, we introduce a new, large-scale, fully-annotated RGB image dataset of semantic states featuring a diverse range of human demonstrations of various complex cloth manipulations. This effectively transforms the problem of action recognition into a classification task. We then evaluate the generalizability of our approach by employing domain adaptation techniques to transfer knowledge from human demonstrations to two distinct robotic platforms: Kinova and UR robots. Additionally, we further improve performance by utilizing a semantic state graph learned from human manipulation data. © 2024 by SCITEPRESS – Science and Technology Publications, Lda.

  • 20.
    Tzelepis, Georgios
    et al.
    Volvo Technology AB, VGTT, Gothenburg, Sweden.
    Asif, Ahraz
    Volvo Technology AB, VGTT, Gothenburg, Sweden.
    Baci, Saimir
    Volvo Technology AB, VGTT, Gothenburg, Sweden.
    Cavdar, Selcuk
    Volvo Technology AB, VGTT, Gothenburg, Sweden.
    Erdal Aksoy, Eren
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR - Center for Applied Intelligent Systems Research. Volvo Technology AB, VGTT, Gothenburg, Sweden.
    Deep Neural Network Compression for Image Classification and Object Detection (2019). Conference paper (Refereed)
    Abstract [en]

Neural networks have been notorious for being computationally expensive. This is mainly because neural networks are often over-parametrized and most likely have redundant nodes or layers as they get deeper and wider. Their demand for hardware resources prohibits their extensive use in embedded devices and puts restrictions on tasks like real-time image classification or object detection. In this work, we propose a network-agnostic model compression method infused with a novel dynamical clustering approach to reduce the computational cost and memory footprint of deep neural networks. We evaluated our new compression method on five different state-of-the-art image classification and object detection networks. In classification networks, we pruned about 95% of network parameters. In advanced detection networks such as YOLOv3, our proposed compression method managed to reduce the model parameters by up to 59.70%, which yielded 110x less memory without sacrificing much in accuracy.
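A toy sketch of the pruning ingredient (magnitude-based pruning of a weight matrix); the paper's method additionally uses a dynamical clustering approach, which this illustration does not reproduce.

```python
import numpy as np

# Toy magnitude pruning sketch: zero out the smallest-magnitude weights.
# The paper's compression additionally uses a dynamical clustering approach,
# which this illustration does not reproduce.

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero the fraction `sparsity` of weights with the smallest absolute value."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) >= threshold, weights, 0.0)

w = np.random.randn(4, 4)
pruned = magnitude_prune(w, sparsity=0.95)
print(f"remaining nonzero: {np.count_nonzero(pruned)} of {w.size}")
```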
